Yes, I think the potentially confusing thing about this code is that it
appears to do two TLB lookups whenever the access crosses a cache-line
boundary, even if both of the accessed cache lines are on the same page.
Conceptually we should only need two TLB lookups if the access crosses a
page boundary (which, as Gabe points out, implies that the access also
crosses a cache-line boundary, though the converse is not true).
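
To make the distinction concrete, here is a minimal, self-contained sketch
of the two checks. It is not the gem5 code: roundDown() is reimplemented
locally to match how the quoted snippet uses it, the crosses() helper is a
hypothetical name, and the 64-byte block and 4096-byte page sizes are
assumed values chosen only for illustration.

```cpp
#include <cassert>
#include <cstdint>

using Addr = std::uint64_t;

// Assumed sizes for illustration only; both must be powers of two.
constexpr Addr kBlockSize = 64;
constexpr Addr kPageSize = 4096;

// Round addr down to a multiple of align (align must be a power of two).
// Local stand-in for the roundDown() helper used in the quoted code.
constexpr Addr roundDown(Addr addr, Addr align)
{
    return addr & ~(align - 1);
}

// True if the access [addr, addr + size) spans a boundary of the given
// granularity. Same test as "split_addr > addr" in the quoted code.
// Hypothetical helper; size must be non-zero.
constexpr bool crosses(Addr addr, Addr size, Addr granularity)
{
    return roundDown(addr + size - 1, granularity) > addr;
}

// Number of translations that are conceptually required: two only when
// the access spans a page boundary.
constexpr int tlbLookupsNeeded(Addr addr, Addr size)
{
    return crosses(addr, size, kPageSize) ? 2 : 1;
}
```

With these assumed sizes, an 8-byte access at 0x103C crosses a block
boundary but stays on one page, so one translation would suffice even
though the memory access itself still has to be split. An 8-byte access at
0x1FFC crosses both kinds of boundary and needs two.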

I think the question Nilay is asking is whether this code is doing these
unnecessary TLB lookups just to keep the code simpler, or whether there is
a deeper reason why it's hard to do two TLB lookups only when absolutely
necessary.

Steve

On Thu, Jul 7, 2011 at 2:12 PM, Gabriel Michael Black <[email protected]> wrote:

> When I did the original version of this code (since improved by others) I
> was told, by Steve I think, that accesses have to be contained in a single
> "block". The size of the peer's block is reported through the port
> interface. I think it's assumed that the page size is at least as large as a
> cache line and that all page boundaries are also "block" boundaries. This
> should be a valid assumption, although there's no true guarantee I suppose.
>
> Gabe
>
>
> Quoting Nilay Vaish <[email protected]>:
>
>> Yesterday, Brad, Steve and I were looking at code for TimingSimpleCPU.
>> There is a portion of the read/writeMem function that is not completely
>> explainable. I have copied the code below.
>>
>>    Addr split_addr = roundDown(addr + size - 1, block_size);
>>    assert(split_addr <= addr || split_addr - addr < block_size);
>>
>>    _status = DTBWaitResponse;
>>    if (split_addr > addr) {
>>        RequestPtr req1, req2;
>>        assert(!req->isLLSC() && !req->isSwap());
>>        req->splitOnVaddr(split_addr, req1, req2);
>>
>>        WholeTranslationState *state =
>>            new WholeTranslationState(req, req1, req2, new uint8_t[size],
>>                                      NULL, mode);
>>        DataTranslation<TimingSimpleCPU> *trans1 =
>>            new DataTranslation<TimingSimpleCPU>(this, state, 0);
>>        DataTranslation<TimingSimpleCPU> *trans2 =
>>            new DataTranslation<TimingSimpleCPU>(this, state, 1);
>>
>>        thread->dtb->translateTiming(req1, tc, trans1, mode);
>>        thread->dtb->translateTiming(req2, tc, trans2, mode);
>>    } else {
>>        WholeTranslationState *state =
>>            new WholeTranslationState(req, new uint8_t[size], NULL, mode);
>>        DataTranslation<TimingSimpleCPU> *translation
>>            = new DataTranslation<TimingSimpleCPU>(this, state);
>>        thread->dtb->translateTiming(req, tc, translation, mode);
>>    }
>>
>>
>> The code calls translateTiming() either once or twice depending on whether
>> or not the memory to be read lies in a single cache block. Shouldn't the
>> check instead be whether or not the memory to be read lies in a single page?
>>
>> Thanks
>> Nilay
>> _______________________________________________
>> gem5-dev mailing list
>> [email protected]
>> http://m5sim.org/mailman/listinfo/gem5-dev