Follow-up Comment #2, bug #66257 (group grub):

Q. What's the code logic?

A. The "grub_tsc_init" function initializes the TSC by setting grub_tsc_rate. The call stack is:

    grub_tsc_init -> grub_tsc_calibrate_from_pmtimer -> grub_divmod64

Here, the "grub_divmod64" function takes "tsc_diff" as its second parameter, and "tsc_diff" comes from "grub_pmtimer_wait_count_tsc". In "grub_pmtimer_wait_count_tsc", we call the "grub_get_tsc" function to read the time stamp counter into the "start_tsc" variable, then enter a "while(1)" loop that reads "end_tsc" with the same function and, after 3580 ticks, returns "end_tsc - start_tsc".

"grub_get_tsc" executes the "rdtsc" instruction, but "rdtsc" is not reliable here (for the reason, see the next question). This can make "tsc_diff" a very big number larger than (1UL << 32), or a negative number, so that grub_tsc_rate becomes zero.

When the "run_menu" function starts, it calls "grub_tsc_get_time_ms" to get the current time and check whether the timeout has been reached. Because "grub_tsc_rate" is zero, "grub_tsc_get_time_ms" returns zero, so the timeout is never reached and the grub menu gets stuck...
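To make the arithmetic concrete, here is a minimal sketch in C. It assumes the rate is computed as 2^32 / tsc_diff and that elapsed ticks are scaled by that rate in 32.32 fixed point. This is consistent with the description above but is not copied from the GRUB source, and the helper names "rate_from_tsc_diff" and "ticks_to_ms" are hypothetical. (3580 ticks of the 3.579545 MHz ACPI PM timer are about 1 ms, so "tsc_diff" is roughly the number of TSC ticks per millisecond.)

```c
#include <stdint.h>

/* Hypothetical helper: models grub_tsc_rate as 2^32 / tsc_diff
   (an assumption based on the description, not the GRUB source). */
uint64_t rate_from_tsc_diff(uint64_t tsc_diff)
{
  if (tsc_diff == 0)
    return 0;                        /* avoid dividing by zero */
  return (1ULL << 32) / tsc_diff;    /* 0 as soon as tsc_diff > 2^32 */
}

/* Hypothetical helper: scales elapsed TSC ticks to milliseconds,
   i.e. ticks * rate / 2^32, split in two halves to avoid overflow. */
uint64_t ticks_to_ms(uint64_t ticks, uint64_t rate)
{
  uint64_t hi = ticks >> 32;
  uint64_t lo = ticks & 0xffffffffULL;

  return ((lo * rate) >> 32) + hi * rate;   /* always 0 when rate == 0 */
}
```

With tsc_diff = 2000000 (a 2 GHz TSC) the rate is 2147, and 2000000000 elapsed ticks scale to 999 ms. Once tsc_diff exceeds 2^32, or an "end_tsc - start_tsc" that should be negative wraps around to a huge unsigned value, the rate is 0 and every elapsed time scales to 0 ms.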
Q. What's the difference between rdtsc and rdtscp instructions in x86_64 architecture?

Here is the explanation from the Intel manual, as checked with an Intel OS expert:

A. The RDTSC instruction is not a serializing instruction. It does not necessarily wait until all previous instructions have been executed before reading the counter. Similarly, subsequent instructions may begin execution before the read operation is performed. The following items may guide software seeking to order executions of RDTSC:

- If software requires RDTSC to be executed only after all previous instructions have executed and all previous loads are globally visible, it can execute LFENCE immediately before RDTSC.
- If software requires RDTSC to be executed only after all previous instructions have executed and all previous loads and stores are globally visible, it can execute the sequence MFENCE;LFENCE immediately before RDTSC.
- If software requires RDTSC to be executed prior to execution of any subsequent instruction (including any memory accesses), it can execute the sequence LFENCE immediately after RDTSC.

A. The RDTSCP instruction is not a serializing instruction, but it does wait until all previous instructions have executed and all previous loads are globally visible. But it does not wait for previous stores to be globally visible, and subsequent instructions may begin execution before the read operation is performed. The following items may guide software seeking to order executions of RDTSCP:

- If software requires RDTSCP to be executed only after all previous stores are globally visible, it can execute MFENCE immediately before RDTSCP.
- If software requires RDTSCP to be executed prior to execution of any subsequent instruction (including any memory accesses), it can execute LFENCE immediately after RDTSCP.

Q. Why do we do this fix?

A.
Changing the "rdtsc" instruction to "rdtscp" makes sure the counter is read only after previous instructions have executed and all previous loads are globally visible, which orders the read with respect to the preceding instructions. Adding an "lfence" load barrier instruction after "rdtscp" orders the read with respect to the subsequent instructions. So we get a correct tsc value in this timeout check scenario.

A. The added "if (grub_tsc_rate == 0)" check is a safeguard against other, unknown instruction anomalies: it gives the other methods of getting "grub_tsc_rate" ("grub_tsc_calibrate_from_pit || grub_tsc_calibrate_from_efi") an opportunity to run instead of leaving the grub menu stuck.

A. After this change, we could not reproduce this issue in 1060 AC power-on/power-off stress test cycles.

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?66257>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
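The two parts of the fix described in the answers above can be sketched in C as follows. This is an illustration under stated assumptions, not the actual patch: "read_tsc_ordered", "calibrate_from_diff" and "calibrate_with_fallback" are hypothetical names, the rate formula 2^32 / tsc_diff is assumed, and the second call in the fallback chain merely stands in for the PIT/EFI calibration methods. The inline assembly uses GCC/Clang syntax and is compiled only on x86.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
/* RDTSCP waits for earlier instructions and loads; the LFENCE after it
   keeps later instructions from starting before the counter is read. */
uint64_t read_tsc_ordered(void)
{
  uint32_t lo, hi, aux;

  __asm__ volatile ("rdtscp\n\tlfence"
                    : "=a" (lo), "=d" (hi), "=c" (aux)
                    :
                    : "memory");
  (void) aux;                 /* IA32_TSC_AUX value, not needed here */
  return ((uint64_t) hi << 32) | lo;
}
#endif

/* Hypothetical stand-in for one calibration method.
   Returns 1 on success, 0 so that the caller can try the next method. */
int calibrate_from_diff(uint64_t tsc_diff, uint64_t *rate)
{
  if (tsc_diff == 0)
    return 0;
  *rate = (1ULL << 32) / tsc_diff;   /* assumed rate formula */
  if (*rate == 0)                    /* the added zero-rate check */
    return 0;
  return 1;
}

/* Hypothetical caller: '||' short-circuits, so the fallback method only
   runs when the first one reports failure. */
int calibrate_with_fallback(uint64_t pmtimer_diff, uint64_t fallback_diff,
                            uint64_t *rate)
{
  return calibrate_from_diff(pmtimer_diff, rate)
         || calibrate_from_diff(fallback_diff, rate);
}
```

Without the zero-rate check, the first method would report success with a rate of 0 and the fallback would never get a chance to run.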