Hi Yan,

Check out: http://reviews.gem5.org/r/2369/

Perhaps the problem you are struggling with is even more complex, but at
least the patches on the review board should fix up a few issues.

Andreas

On 28/08/2014 03:27, "Zi Yan via gem5-users" <gem5-users@gem5.org> wrote:

>Hi Mitch,
>
>After I applied the two patches and the IsQuiesce modification, the O3
>CPU stays on the same track as the atomic CPU longer than before. But the
>apic_timer_interrupt function shows up again on the O3 CPU. It used to
>appear after about 500,000 instructions; now it appears after about
>990,000 instructions.
>
>In addition, I dumped tick numbers along with the PCs and found a gap of
>459500 ticks between the last committed user instruction and the first
>instruction in the apic_timer_interrupt function. This confirms that
>the last user instruction sits in commit until the timer interrupt
>happens. Am I right about this?
>
>As a next step, I think I need to tag all x86 quiesce instructions.
>Do you have a list of those instructions? Or is this covered somewhere
>in the Intel manual?
>
>Thanks.
>
>--
>Best Regards
>Yan Zi
>
>On 27 Aug 2014, at 15:59, Mitch Hayenga wrote:
>
>> Yep, that should do it.
>>
>>
>> On Wed, Aug 27, 2014 at 2:57 PM, Zi Yan <birdman...@gmail.com> wrote:
>>
>>> Thanks.
>>>
>>> I will apply patches 1 and 2.
>>>
>>> For 3, I need to change src/arch/x86/isa/microops/specop.isa:66
>>> from
>>>     setFlags | (ULL(1) << StaticInst::IsNonSpeculative),
>>> to
>>>     setFlags | (ULL(1) << StaticInst::IsNonSpeculative) |
>>>         (ULL(1) << StaticInst::IsQuiesce),
>>>
>>> Am I doing the right thing to tag the "MicroHalt" instruction as
>>> "IsQuiesce"?
>>>
>>> BTW, to boot Linux I installed Gentoo inside QEMU, booted it with
>>> X86KvmCPU, then took checkpoints and ran from those checkpoints.
>>>
>>> I will report whether this works or not.
>>>
>>> Thanks.
>>>
>>> --
>>> Best Regards
>>> Yan Zi
>>>
>>> On 27 Aug 2014, at 15:44, Mitch Hayenga wrote:
>>>
>>>> There are probably three main patches that could help.  The fact you
>>>> mention the timer interrupt makes me think Andreas is right and these
>>>> might solve your issue.
>>>>
>>>> 1. http://reviews.gem5.org/r/2363/  - o3 is supposed to stop fetching
>>>> instructions immediately once a quiesce instruction is encountered,
>>>> but some managed to sneak by.  Quiesce is used for things like
>>>> sleeping until an interrupt occurs, etc.  Without this patch, we
>>>> experienced the case where o3 state would get corrupted and an
>>>> instruction would sit at commit until the next timer interrupt
>>>> happened, at which point taking the interrupt would clear the state
>>>> and execution would continue (until this same bug happened again).
>>>>
>>>> 2. http://reviews.gem5.org/r/2367/  - If o3 was being drained while an
>>>> interrupt occurred on x86, it could deadlock.
>>>>
>>>> 3. I believe this last patch will be posted in a day or two.  x86
>>>> currently does not tag any instruction that suspends() the CPU as a
>>>> "quiesce".  This is required by o3 to properly operate, but not by the
>>>> Atomic CPU.  This makes the issue in #1 far more likely to occur.  It's
>>>> pretty amazing that x86 booted linux at all on o3 without this.  I
>>>> believe this patch will be posted shortly, but otherwise you could just
>>>> tag the "MicroHalt" instruction as "IsQuiesce" yourself.
>>>>
>>>> So a combination of those things (mainly the last one) could lead to
>>>> what you are seeing.
>>>>
>>>>
>>>> On Wed, Aug 27, 2014 at 12:59 PM, Zi Yan via gem5-users
>>>> <gem5-users@gem5.org> wrote:
>>>>
>>>>> OK. Could you please tell me which patches those are? On the
>>>>> review board there are quite a lot of new patches waiting
>>>>> for review.
>>>>>
>>>>> I can apply those patches myself and do a quick test.
>>>>>
>>>>> Thanks.
>>>>>
>>>>> --
>>>>> Best Regards
>>>>> Yan Zi
>>>>>
>>>>> On 27 Aug 2014, at 13:56, Andreas Hansson wrote:
>>>>>
>>>>>> Hi Yan,
>>>>>>
>>>>>> I would suspect this is due to a bug in the X86 O3 CPU. There have
>>>>>> been quite a few fixes posted on the review board for similar
>>>>>> issues. I hope to have these committed in the next week or so.
>>>>>>
>>>>>> Andreas
>>>>>>
>>>>>>
>>>>>> On 27/08/2014 18:02, "Zi Yan via gem5-users" <gem5-users@gem5.org>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I am running kmeans via Hadoop in gem5 X86 FS mode. I am using
>>>>>>> Linux kernel 3.2.60 with the linux-2.6.28.4 configuration file
>>>>>>> from gem5.org.
>>>>>>>
>>>>>>> I take a checkpoint before a map task and put an "m5 exit" after
>>>>>>> the map task.
>>>>>>> I am using *X86KvmCPU* to take checkpoints.
>>>>>>>
>>>>>>> When I restore from the same checkpoint, the atomic CPU and the O3
>>>>>>> CPU execute very different numbers of instructions:
>>>>>>> 1) The atomic CPU executes about 350 million instructions, reaches
>>>>>>> "m5 exit", then stops the simulation.
>>>>>>> 2) The O3 CPU executes more than 12 billion instructions and still
>>>>>>> does not reach "m5 exit" to stop the simulation.
>>>>>>>
>>>>>>> I dumped the committed PCs from the atomic CPU and the O3 CPU and
>>>>>>> found that after about 500,000 instructions the systems behave
>>>>>>> differently: the atomic CPU is still executing user code, but the
>>>>>>> O3 CPU switches to apic_timer_interrupt (a kernel function; it
>>>>>>> also appears in the atomic CPU execution, but somewhere else).
>>>>>>>
>>>>>>> Could anyone please give some advice about why this happens?
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards
>>>>>>> Yan Zi
>>>>>>
>>>>>>
>>>>>> -- IMPORTANT NOTICE: The contents of this email and any attachments
>>>>>> are confidential and may also be privileged. If you are not the
>>>>>> intended recipient, please notify the sender immediately and do not
>>>>>> disclose the contents to any other person, use it for any purpose,
>>>>>> or store or copy the information in any medium.  Thank you.
>>>>>>
>>>>>> ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ,
>>>>>> Registered in England & Wales, Company No:  2557590
>>>>>> ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1
>>>>>> 9NJ, Registered in England & Wales, Company No:  2548782
>>>>>
>>>>> _______________________________________________
>>>>> gem5-users mailing list
>>>>> gem5-users@gem5.org
>>>>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
>>>>>
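
The trace comparison described in this thread (dumping committed PCs and
ticks from the atomic and O3 runs, then looking for the point where they
part ways) might be sketched roughly as follows. This is only an
illustration under stated assumptions: the "<tick> <pc-in-hex>" line format
and the helper names parse_trace, first_divergence and tick_gap_before are
hypothetical, not gem5 output or gem5 API.

```python
from typing import Iterable, List, Optional, Tuple

# One committed instruction: (tick, pc). This layout is an assumption.
Record = Tuple[int, int]


def parse_trace(lines: Iterable[str]) -> List[Record]:
    """Parse lines of the assumed form '<tick> <pc-in-hex>'.

    Lines that do not have exactly two fields are skipped.
    """
    records: List[Record] = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue
        records.append((int(parts[0]), int(parts[1], 16)))
    return records


def first_divergence(ref: List[Record], test: List[Record]) -> Optional[int]:
    """Return the index of the first commit whose PC differs.

    Only PCs are compared, because the two CPU models are not expected
    to agree on ticks. Returns None if the shorter trace is a prefix of
    the longer one.
    """
    for i, ((_, pc_ref), (_, pc_test)) in enumerate(zip(ref, test)):
        if pc_ref != pc_test:
            return i
    return None


def tick_gap_before(trace: List[Record], index: int) -> int:
    """Ticks elapsed between the commit at `index` and the previous one."""
    if index <= 0 or index >= len(trace):
        raise ValueError("index must refer to a commit with a predecessor")
    return trace[index][0] - trace[index - 1][0]
```

Calling first_divergence on the two parsed traces gives the position of the
first mismatching commit, and tick_gap_before on the O3 trace at that
position gives how long the previous instruction sat before the next commit,
which is the kind of gap reported as 459500 ticks in this thread.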

