On Wed, Jun 14, 2006 at 01:25:22 +1000, Erik de Castro Lopo wrote:
> The binary in question is a complete file system. As such it is

Didn't James say it was a compressed file system? If so, it's simply a stream of bytes, not a mixture of different data types. The compression algorithm doesn't know the meaning or size of each bit of data in the filesystem; it just treats it as a stream of bytes. It might read those bytes as 32-bit words, or it might read them as individual bytes, but it doesn't matter. It just processes what it sees as a chunk of random unstructured data.

Once it's uncompressed, nothing changes. The kernel, when reading a file system, doesn't read individual 8-, 16- and 32-bit fields off the disc; it reads a chunk of data (probably some number of sectors) then tries to make sense of it. By then, it's been copied at least once (maybe via DMA) by code which has no knowledge of the underlying structure of the data it's processing. The different fields don't have meaning until they're interpreted by the application using them.

Now, I don't know whether byte swapping is needed before or after uncompressing, both, or neither, but I don't believe any knowledge of the underlying structure of the data is necessary to do that byte swapping.

Let me give you an example which may help you understand why I think this. I write processor models (x86 little-endian host) for a living, and have worked on both big- and little-endian cores, including some that can switch endianness at run time. When we load a target binary, all we know is that it's an ELF (or S-record, or whatever) image and that it's big or little endian, but not the meaning of each part of the image.

For our ARM models, and most other big-endian cores, we simply byte swap big-endian data as aligned 32-bit words, and store it in little-endian format in host memory. We store the data in host-endian format and do the conversion when it's read, depending upon the current processor mode. Unaligned accesses complicate the process[0], but are irrelevant for the purpose of this example.
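
A minimal Python sketch of that kind of word-wise swap may make the idea concrete. The function name `swap_words32` and the sample bytes are hypothetical, chosen only for illustration; this is not the loader code from the models mentioned above, and it assumes the image length is a multiple of four bytes.

```python
import struct


def swap_words32(image: bytes) -> bytes:
    """Reverse the byte order of every aligned 32-bit word in image.

    The buffer is treated as a flat array of 32-bit numbers; nothing
    here knows or cares what any individual field means.
    """
    if len(image) % 4 != 0:
        raise ValueError("image length must be a multiple of 4 bytes")
    count = len(image) // 4
    # Read the buffer as big-endian 32-bit words...
    words = struct.unpack(f">{count}I", image)
    # ...and write the same values back out as little-endian words.
    return struct.pack(f"<{count}I", *words)


# Hypothetical two-word big-endian image.
big_endian_image = bytes([0x12, 0x34, 0x56, 0x78, 0xDE, 0xAD, 0xBE, 0xEF])
host_image = swap_words32(big_endian_image)
print(host_image.hex())  # 78563412efbeadde
```

Applying `swap_words32` a second time restores the original bytes, so the same routine works in either direction.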

When we do the initial byte swapping, we neither know nor care what the meaning or size of each individual location in the binary is; we simply treat it as a collection of 32-bit numbers and everything just works.

I think this file system image can be treated the same way, at least until you actually want to mount it and interpret the contents. Then, and only then, do you need to know how big each field is.

Cheers,

John

[0] Some cores raise an exception, some do an aligned access and rotate the data, some ignore the least significant address bits to force alignment, and some do multiple bus cycles to handle the unaligned access.

-- 
> Hmph, whatever happened to *ethics*? - That's what I'd like to know!
It's still there, just to the north of Kent. Nasty elisp you've got there....
 -- Sean Purdy