Hi Mitch,

After applying the two patches and the IsQuiesce modification, the O3 CPU
follows the same execution path as the atomic CPU for longer than before.
However, the apic_timer_interrupt function still shows up early on the O3
CPU. It used to appear after about 500,000 instructions; now it appears
after about 990,000 instructions.
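
For reference, a minimal sketch of how the divergence point between the
two committed-PC dumps can be located. It assumes each dump has already
been parsed into a list of integer PCs in commit order; the function name
and that input format are hypothetical, not something gem5 produces
directly.

```python
from itertools import zip_longest


def first_divergence(atomic_pcs, o3_pcs):
    """Return the index of the first committed PC that differs.

    Both arguments are sequences of integer PCs in commit order.
    Returns None if the two traces are identical. If one trace is a
    strict prefix of the other, the length of the shorter trace is
    returned.
    """
    for index, (atomic_pc, o3_pc) in enumerate(zip_longest(atomic_pcs, o3_pcs)):
        if atomic_pc != o3_pc:
            return index
    return None


if __name__ == "__main__":
    # Made-up PCs purely for illustration.
    atomic = [0x400100, 0x400104, 0x400108, 0x40010C]
    o3 = [0x400100, 0x400104, 0xFFFFFFFF81000000]
    print(first_divergence(atomic, o3))
```

The returned index is the instruction count at which the O3 trace leaves
the atomic trace, which is the number I am quoting above.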

In addition, I now dump tick numbers along with the PCs, and I found a
gap of 459500 ticks between the last committed user instruction and the
first instruction in the apic_timer_interrupt function. This seems to
confirm that the last user instruction sits at commit until the timer
interrupt happens. Am I right about this?
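
A minimal sketch of the gap check described above, assuming the dump has
been parsed into (tick, pc) tuples in commit order. The tuple layout, the
function name and the sample addresses are hypothetical and only meant to
illustrate the calculation.

```python
def largest_commit_gap(trace):
    """Find the largest tick gap between consecutive committed instructions.

    trace is a list of (tick, pc) tuples in commit order.
    Returns (gap, previous_entry, next_entry), or None if the trace has
    fewer than two entries.
    """
    best = None
    for prev, cur in zip(trace, trace[1:]):
        gap = cur[0] - prev[0]
        if best is None or gap > best[0]:
            best = (gap, prev, cur)
    return best


if __name__ == "__main__":
    # Made-up ticks and PCs: two user instructions, then a kernel PC.
    trace = [
        (1000, 0x400100),
        (1500, 0x400104),
        (461000, 0xFFFFFFFF81000000),
    ]
    gap, last_user, first_kernel = largest_commit_gap(trace)
    print(gap, hex(last_user[1]), hex(first_kernel[1]))
```

A large gap that ends at the first PC of apic_timer_interrupt is the
pattern I am seeing.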

As a next step, I think I need to label all x86 quiesce instructions.
Do you have a list of those instructions? Or is there somewhere in the
Intel manual that covers this?

Thanks.

--
Best Regards
Yan Zi

On 27 Aug 2014, at 15:59, Mitch Hayenga wrote:

> Yep, that should do it.
>
>
> On Wed, Aug 27, 2014 at 2:57 PM, Zi Yan <birdman...@gmail.com> wrote:
>
>> Thanks.
>>
>> I will apply patches 1 and 2.
>>
>> For 3, I need to change the file src/arch/x86/isa/microops/specop.isa:66
>> from
>> setFlags | (ULL(1) << StaticInst::IsNonSpeculative),
>> to
>> setFlags | (ULL(1) << StaticInst::IsNonSpeculative) | (ULL(1) <<
>> StaticInst::IsQuiesce),
>>
>> Am I doing the right thing to tag "MicroHalt" instruction as "IsQuiesce"?
>>
>> BTW, to boot Linux I installed Gentoo inside QEMU, booted with
>> x86KvmCPU, then took checkpoints and ran from those checkpoints.
>>
>> I will report whether this works or not.
>>
>> Thanks.
>>
>> --
>> Best Regards
>> Yan Zi
>>
>> On 27 Aug 2014, at 15:44, Mitch Hayenga wrote:
>>
>>> There are probably three main patches that could help.  The fact you
>>> mention the timer interrupt makes me think Andreas is right and these
>>> might solve your issue.
>>>
>>> 1. http://reviews.gem5.org/r/2363/  - o3 is supposed to stop fetching
>>> instructions immediately once a quiesce instruction is encountered, but
>>> some managed to sneak by.  Quiesce is used for things like sleeping until
>>> an interrupt occurs, etc.  Without this patch, we experienced the case
>>> where o3 state would get corrupted and an instruction would sit at commit
>>> until the next timer interrupt happened, at which point taking the
>>> interrupt would clear the state and execution would continue (until this
>>> same bug happened again).
>>>
>>> 2. http://reviews.gem5.org/r/2367/  - If o3 was being drained while an
>>> interrupt occurred on x86, it could deadlock.
>>>
>>> 3. I believe this last patch will be posted in a day or two.  x86
>>> currently does not tag any instruction that suspends() the CPU as a
>>> "quiesce".  Such tagging is required by o3 to operate properly, but not
>>> by the Atomic CPU.  Its absence makes the issue in #1 far more likely to
>>> occur.  It's pretty amazing that x86 booted linux at all on o3 without
>>> this.  I believe this patch will be posted shortly, but otherwise you
>>> could just tag the "MicroHalt" instruction as "IsQuiesce" yourself.
>>>
>>> So a combination of those things (mainly the last one) could lead to what
>>> you are seeing.
>>>
>>>
>>> On Wed, Aug 27, 2014 at 12:59 PM, Zi Yan via gem5-users
>>> <gem5-users@gem5.org> wrote:
>>>
>>>> OK. Could you please tell me which patches those are? The
>>>> review board has quite a lot of new patches waiting
>>>> for review.
>>>>
>>>> I can apply those patches myself and do a quick test.
>>>>
>>>> Thanks.
>>>>
>>>> --
>>>> Best Regards
>>>> Yan Zi
>>>>
>>>> On 27 Aug 2014, at 13:56, Andreas Hansson wrote:
>>>>
>>>>> Hi Yan,
>>>>>
>>>>> I would suspect this is due to a bug in the X86 O3 CPU. There have been
>>>>> quite a few fixes posted on the review board for similar issues. I hope
>>>>> to have these committed in the next week or so.
>>>>>
>>>>> Andreas
>>>>>
>>>>>
>>>>> On 27/08/2014 18:02, "Zi Yan via gem5-users" <gem5-users@gem5.org>
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I am running kmeans via Hadoop in gem5 X86 FS mode. I am using
>>>>>> Linux kernel 3.2.60 with the configuration file linux-2.6.28.4 from
>>>>>> gem5.org.
>>>>>>
>>>>>> I take a checkpoint before a map task and put an "m5 exit" after the
>>>>>> map task.
>>>>>> I am using *X86kvmCPU* to take checkpoints.
>>>>>>
>>>>>> When I restore from the same checkpoint, the atomic CPU and the O3 CPU
>>>>>> execute very different numbers of instructions:
>>>>>> 1) The atomic CPU executes about 350 million instructions, reaches
>>>>>> "m5 exit", then stops the simulation.
>>>>>> 2) The O3 CPU executes more than 12 billion instructions and still
>>>>>> does not reach "m5 exit" to stop the simulation.
>>>>>>
>>>>>> I dumped the committed PCs from the atomic CPU and the O3 CPU and found
>>>>>> that after about 500,000 instructions the systems behave differently:
>>>>>> the atomic CPU is still executing user code, but the O3 CPU switches to
>>>>>> apic_timer_interrupt (a kernel function; it also appears in the atomic
>>>>>> CPU execution, but somewhere else).
>>>>>>
>>>>>> Could anyone please give some advice about why this happens?
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> --
>>>>>> Best Regards
>>>>>> Yan Zi
>>>>>
>>>>>
>>>>
>>>> _______________________________________________
>>>> gem5-users mailing list
>>>> gem5-users@gem5.org
>>>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
>>>>
>>
