It looks like the right place to put "the code that checks for a fault and
calls the CPU read/write function" would be
BaseDynInst<Impl>::finishTranslation().

All the code to make this work seems to be there already. If it hits in the
TLB, then TheISA::TLB::translateTiming() should call translation->finish()
right away. I checked Alpha, x86 and ARM and they do. That would execute
"the code" at the end of the initiateTranslation()->translateTiming() call
chain, which is effectively the same as now, where "the code" is executed
right after initiateTranslation() returns.
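
To make the ordering argument concrete, here is a minimal sketch of the hit
case. Everything in it (DynInst, AlwaysHitTlb, readOnHit, the trace strings)
is a hypothetical stand-in I made up for illustration, not the real M5
classes; it only assumes that a TLB hit calls the finish callback
synchronously.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; not the real M5 classes.
enum class Fault { NoFault, PageFault };

struct DynInst {
    std::vector<std::string> trace;  // records the order of events

    // "The code": check the fault, then start the access or mark executed.
    void finishTranslation(Fault fault) {
        if (fault == Fault::NoFault)
            trace.push_back("cpu->read");
        else
            trace.push_back("setExecuted");
    }
};

// A TLB that always hits: the callback fires before translateTiming returns.
struct AlwaysHitTlb {
    void translateTiming(DynInst &inst) {
        inst.finishTranslation(Fault::NoFault);
    }
};

// Mirrors read(): start the translation and do nothing else afterwards.
std::vector<std::string> readOnHit() {
    DynInst inst;
    AlwaysHitTlb tlb;
    inst.trace.push_back("initiateTranslation begin");
    tlb.translateTiming(inst);
    inst.trace.push_back("initiateTranslation end");
    return inst.trace;
}
```

In this sketch the read is started inside the callback, just before the
translation call returns, so on a hit nothing observable changes compared to
checking the fault right after the call.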

In case of a TLB miss,
1) for Alpha (and other archs that handle misses in software), it would call
translation->finish() with a fault, which can be handled in
finishTranslation() the same way.

2) for archs that have a hardware page-table walker,

 a) if memory is timing, then translation->finish() is called when the walk
is finished. x86 seems to have the code for this in Walker::recvTiming(), and
ARM has the code and it's working with TimingSimpleCPU.

 b) if memory is atomic (is that a possible combination? dyn_inst + atomic?),
x86 doesn't seem to have code for this case: Walker::recvAtomic() does
nothing.
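
The miss cases 1) and 2a) can be sketched roughly as below. This is only a
toy model under my own assumptions: SketchTlb, MissPolicy, completeOneWalk
and missingLoad are invented names, the "walk" is just a queue of pending
callbacks, and case 2b) (atomic memory) is not modeled at all.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; not the real M5 classes.
enum class Fault { NoFault, TlbMissFault };
enum class MissPolicy { SoftwareFault, HardwareWalk };

using FinishCallback = std::function<void(Fault)>;

struct SketchTlb {
    MissPolicy policy;
    std::queue<FinishCallback> walks;  // table walks still in flight

    explicit SketchTlb(MissPolicy p) : policy(p) {}

    void translateTiming(bool hit, FinishCallback finish) {
        if (hit) {
            finish(Fault::NoFault);         // hit: finish immediately
        } else if (policy == MissPolicy::SoftwareFault) {
            finish(Fault::TlbMissFault);    // case 1: miss reported as a fault
        } else {
            walks.push(std::move(finish));  // case 2a: finish after the walk
        }
    }

    // Models the walker's final timing response arriving.
    void completeOneWalk() {
        assert(!walks.empty());
        FinishCallback finish = std::move(walks.front());
        walks.pop();
        finish(Fault::NoFault);
    }
};

// Runs one load that misses and returns the actions taken, in order.
std::vector<std::string> missingLoad(MissPolicy policy) {
    std::vector<std::string> actions;
    SketchTlb tlb(policy);
    tlb.translateTiming(false, [&actions](Fault f) {
        actions.push_back(f == Fault::NoFault ? "read" : "fault");
    });
    actions.push_back("returned");  // translateTiming has returned here
    while (!tlb.walks.empty())
        tlb.completeOneWalk();
    return actions;
}
```

With the software policy the fault is seen before the call returns; with the
hardware-walk policy the call returns first and the read only starts once
the walk completes, which is the behaviour the callback is supposed to give.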

Sounds safe?

Thanks,

Min


On Tue, Jul 13, 2010 at 6:19 PM, Gabriel Michael Black <
gbl...@eecs.umich.edu> wrote:

> I think you've mostly interpreted this correctly. The instructions aren't
> retried if the translation fails, they just hang around and wait for it. The
> check if fault == NoFault will work if the translation is finished by the
> time initiateTranslation is done. That's true for everything we have now
> except x86 and ARM, neither of which is currently supported by O3. What
> might work to fix this is to move the code that checks for a fault and calls
> the CPU read/write function into the callback itself. That way once
> translation is done, whenever that may be, the correct action will happen.
>
> Gabe
>
>
>
> Quoting Min Kyu Jeong <mkje...@gmail.com>:
>
>  Thanks, Tim
>>
>> It looks like, for the DTLB translation, some code is there to handle this,
>> but it is not complete for the ISAs that do a hardware page table walk.
>>
>> cpu/base_dyn_inst.hh
>> BaseDynInst<Impl>::read(Addr addr, T &data, unsigned flags)
>> {
>> ...
>>    initiateTranslation(req, sreqLow, sreqHigh, NULL, BaseTLB::Read);
>>
>>    if (fault == NoFault) {
>>        effAddr = req->getVaddr();
>>        effAddrValid = true;
>>        fault = cpu->read(req, sreqLow, sreqHigh, data, lqIdx);
>>    } else {
>> ...
>>        this->setExecuted();
>>    }
>>
>> It first initiates the translation, and then calls cpu->read() as long as
>> a fault has not been generated during the translation. This should work
>> for Alpha, where a TLB miss is treated as a fault and handled in software
>> by PALcode. The Alpha TLB returns a fault in case of a miss.
>>
>> For the ISAs that do a hardware page-table walk, an instruction that
>> misses in the TLB should neither start a read (cpu->read()) nor be taken
>> out of the instruction window (this->setExecuted()). I think it should wait
>> for the table walk to finish and then retry the execution of the load/store
>> (this might not be true, depending on the implementation??)
>>
>> I looked into the x86 code, and if the memory is timing, then the page
>> table walker initiates a memory access and returns without a fault. That
>> means cpu->read() would be called without the translation having finished.
>> The same is true for ARM.
>>
>> Is there any plan or ongoing effort to support this wait-on-TLB-miss on
>> the other ISAs? Or any ideas about how to go about implementing it?
>>
>> Thanks,
>>
>> Min
>>
>> On Mon, Jul 12, 2010 at 5:44 PM, Timothy M Jones <tjon...@inf.ed.ac.uk
>> >wrote:
>>
>>  Hi Min,
>>>
>>> The way that the TLB deals with a timing translation is specific to each
>>> ISA.  I don't have much experience with anything other than Power but for
>>> that ISA, yes, you're correct.  The timing translation is just a wrapper
>>> around the atomic translation.  It seems from a quick check that Alpha is
>>> the same.
>>>
>>> If you actually wanted to have the fetch translation finish on a
>>> different cycle to the one it was initiated on then you would have to
>>> make some changes to the fetch stage to allow that.  I wouldn't have
>>> thought it would be too difficult but might require splitting up several
>>> functions into code that's executed before the translation and code
>>> that's executed afterwards.
>>>
>>> Cheers
>>> Tim
>>>
>>>
>>> On 12/07/2010 18:14, Min Kyu Jeong wrote:
>>>
>>>  Hi,
>>>>
>>>> This question is regarding the changeset
>>>> (http://repo.m5sim.org/m5?cmd=changeset;node=a123bd350935).
>>>>
>>>>   This initiates a timing translation and passes the read or write on
>>>>   to the processor before waiting for it to finish
>>>>
>>>>
>>>> It looks like even in the event of a TLB miss, the TLB walk does not
>>>> delay the actual execution of the loads. Am I correct?
>>>>
>>>> I was trying to find a reference for replacing translateAtomic() in
>>>> the fetch stage with translateTiming(). It would require some mechanism to
>>>> stop the actual fetch until the translation is finished, which doesn't
>>>> seem to exist in the O3 CPU even for the data translation.
>>>>
>>>> Thanks,
>>>>
>>>> Min
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> m5-dev mailing list
>>>> m5-dev@m5sim.org
>>>> http://m5sim.org/mailman/listinfo/m5-dev
>>>>
>>>>
>>> --
>>> Timothy M. Jones
>>> http://homepages.inf.ed.ac.uk/tjones1
>>>
>>> The University of Edinburgh is a charitable body, registered in
>>> Scotland, with registration number SC005336.
>>>
>>>
>>>
>>
>
>