Steve Reinhardt wrote:
> On Sat, Jul 10, 2010 at 11:37 PM, Gabe Black <[email protected]> wrote:
>
>> In ARM's SIMD instruction set extension Neon, there are some
>> instructions which can load or store 3 of something, and that something
>> can be 1, 2, 4, or 8 bytes. To implement this properly, I'm planning to
>> add readBytes and writeBytes functions to the various ExecContexts which
>> would load/store some arbitrary (but practically bounded) number of
>> bytes without performing endian conversion. As a side benefit, this
>> should, I think, let us get rid of the twin data types in SPARC and the
>> various explicit template instantiations used to support them. I think
>> something like this came up for MIPS too, although I don't remember the
>> particulars.
>>
>
> How did we end up handling the MIPS instructions? I don't recall.
>

Me either. Do you remember, Korey? Or whoever handled that if it wasn't
Korey?

>
>> As I've been thinking about the best way to do this, I realized that the
>> alignment of the incoming data won't necessarily be appropriate, and so
>> you can't cast it blindly to whatever type you were really after. The
>> extra copy would be nice to avoid, so I was thinking maybe these
>> functions would take in the buffer to load/store which would have the
>> proper alignment, and that would make its way through the memory system.
>> I don't know what would happen to it, though, on its way. Would it be
>> deleted and reallocated at some point, spoiling the alignment again? The
>> initial data would have to be dynamically allocated to keep the static
>> inst static and since the stack frame would be deallocated between
>> initiateAcc and completeAcc. Would the call to "new" totally outweigh
>> the benefit of not copying? Any thoughts?
>>
>
> You lost me in this paragraph... are you talking about alignment for
> the host system or the simulated system? What extra copy are you
> referring to?
>

Alignment on the host. Let's say I'm trying to read 3 uint16_ts. If I
take the data from the packet and just cast it to a uint16_t *, the host
memory used as the packet payload may not be 2-byte aligned. To make
sure I can pass around and/or manipulate each uint16_t without hitting
some sort of problem on less lenient host architectures, I'd need to
first copy the ambiguously aligned packet payload into an unambiguously
aligned array of uint16_ts. That's the extra copy.

What I was hoping might be possible is to send in a uint16_t array as
the payload to start with, so that even though it ends up being treated
as a generic blob of bytes, when it comes back it has the right
alignment and can be used in place. I don't know if that'd work out,
though, and the more I think about it the more doubtful I am.
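
The extra copy described above could be sketched roughly as follows.
This is only an illustration under assumptions: copyFromPayload is a
hypothetical helper name, not an existing M5 function, and the payload
is modeled as a plain byte pointer rather than an actual packet. It
copies the bytes as-is, with no endian conversion, into storage that is
properly aligned for the element type.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of M5): copy N elements of type T out of
// a byte buffer whose host alignment is unknown.
template <typename T, std::size_t N>
std::array<T, N>
copyFromPayload(const std::uint8_t *payload)
{
    // std::array<T, N> is aligned suitably for T, unlike the raw payload.
    std::array<T, N> out;
    // memcpy places no alignment requirement on its source, so this is
    // safe even on hosts that fault on misaligned loads. The bytes are
    // copied verbatim; no endian conversion is performed.
    std::memcpy(out.data(), payload, sizeof(T) * N);
    return out;
}
```

A caller wanting 3 uint16_ts would use something like
copyFromPayload<std::uint16_t, 3>(data), where data points at the
payload bytes. Casting data directly to uint16_t * would skip the copy
but is only safe when the payload happens to be 2-byte aligned.
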

Although, from what I remember from briefly looking at the read/write
functions that are already there, they do something like that. They do
the "new" inside the read/write, but readBytes/writeBytes would do it in
the instruction object.

Gabe
