Steve Reinhardt wrote:
> On Sat, Jul 10, 2010 at 11:37 PM, Gabe Black <[email protected]> wrote:
>   
>> In ARM's SIMD instruction set extension Neon, there are some
>> instructions which can load or store 3 of something, and that something
>> can be 1, 2, 4, or 8 bytes. To implement this properly, I'm planning to
>> add readBytes and writeBytes functions to the various ExecContexts which
>> would load/store some arbitrary (but practically bounded) number of
>> bytes without performing endian conversion. As a side benefit, this
>> should, I think, let us get rid of the twin data types in SPARC and the
>> various explicit template instantiations used to support them. I think
>> something like this came up for MIPS too, although I don't remember the
>> particulars.
>>     
>
> How did we end up handling the MIPS instructions?  I don't recall.
>   

Me neither. Do you remember, Korey? Or whoever handled that, if it wasn't
Korey?

>   
>> As I've been thinking about the best way to do this, I realized that the
>> alignment of the incoming data won't necessarily be appropriate, and so
>> you can't cast it blindly to whatever type you were really after. The
>> extra copy would be nice to avoid, so I was thinking maybe these
>> functions would take in the buffer to load/store which would have the
>> proper alignment, and that would make its way through the memory system.
>> I don't know what would happen to it, though, on its way. Would it be
>> deleted and reallocated at some point, spoiling the alignment again? The
>> initial data would have to be dynamically allocated to keep the static
>> inst static and since the stack frame would be deallocated between
>> initiateAcc and completeAcc. Would the call to "new" totally outweigh
>> the benefit of not copying? Any thoughts?
>>     
>
> You lost me in this paragraph... are you talking about alignment for
> the host system or the simulated system?  What extra copy are you
> referring to?
>   

Let's say I'm trying to read 3 uint16_ts. If I take the data from the
packet and just cast it to a uint16_t *, the host memory used as the
packet payload may not be 2 byte aligned. What I'd need to do to ensure
I can pass around and/or manipulate each uint16_t without hitting some
sort of problem on less lenient host architectures is to copy the
ambiguously aligned packet payload into an unambiguously aligned array
of uint16_ts first. What I was hoping might be possible is to send in a
uint16_t array as the payload to start with so that even though it ends
up being treated as a generic blob of bytes, when it comes back it has
the right alignment and can be used in place. I don't know if that'd
work out, though, and the more I think about it the more doubtful I am.
That said, from what I remember of a brief look at the read/write
functions that are already there, they do something like that. They do
the "new" inside the read/write, whereas readBytes/writeBytes would do
it in the instruction object.
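
To make the extra copy concrete, here is a minimal C++ sketch of it. This
is not code from the M5 tree: copyFromPayload is a hypothetical helper name,
and the payload is just assumed to be a plain byte pointer of unknown
alignment. It only illustrates copying the bytes into an aligned array,
with no endian conversion, instead of casting the payload pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (an assumption for illustration, not an M5 API):
// copy `count` elements of type T out of a byte buffer whose host
// alignment is unknown into a destination array that the compiler has
// aligned correctly for T. The bytes are copied as-is, so no endian
// conversion takes place.
template <typename T>
void
copyFromPayload(T *dst, const uint8_t *payload, size_t count)
{
    assert(dst != nullptr && payload != nullptr);

    // The risky alternative would be
    //     reinterpret_cast<const T *>(payload)
    // which may fault or misbehave on hosts that require aligned accesses.
    // memcpy has no alignment requirement on either pointer.
    std::memcpy(dst, payload, count * sizeof(T));
}
```

For the 3 x uint16_t case, the caller would declare "uint16_t elems[3];"
and call copyFromPayload(elems, payload, 3). After that, elems can be
passed around and manipulated safely even if payload was at an odd address.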

Gabe
_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev
