Re: [RFC] Documentation about unaligned memory access

Dmitri Vorobiev Fri, 23 Nov 2007 14:53:31 -0800

Daniel Drake wrote:
> Being spoilt by the luxuries of i386/x86_64 I've never really had a good
> grasp on unaligned memory access problems on other architectures and decided
> it was time to figure it out. As a result I've written this documentation
> which I plan to submit for inclusion as
> Documentation/unaligned_memory_access.txt
> 
> Before I do so, any comments on the following?


From the viewpoint of yours truly (and I am a teacher of operating system
classes), this is a long-awaited document, which is going to be very useful,
especially for newbies. My students often make alignment mistakes in their
code, and your article will definitely make my job much easier.

Thank you, Daniel, for your work.

Dmitri

> 
> Thanks,
> Daniel
> 
> 
> 
> 
> UNALIGNED MEMORY ACCESSES
> =========================
> 
> Linux runs on a wide variety of architectures which have varying behaviour
> when it comes to memory access. This document presents some details about
> unaligned accesses, why you need to write code that doesn't cause them,
> and how to write such code!
> 
> 
> What's the definition of an unaligned access?
> =============================================
> 
> Unaligned memory accesses occur when you try to read N bytes of data starting
> from an address that is not evenly divisible by N (i.e. addr % N != 0).
> For example, reading 4 bytes of data from address 0x10000004 is fine, but
> reading 4 bytes of data from address 0x10000005 would be an unaligned memory
> access.
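
The divisibility test above can be written as a small helper. This is only a
sketch in plain userspace C; the name is_naturally_aligned() is made up for
illustration and is not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): returns true if an access of
 * n bytes starting at addr satisfies addr % n == 0.
 */
static bool is_naturally_aligned(uintptr_t addr, size_t n)
{
	assert(n != 0);		/* a zero-length access is meaningless here */
	return addr % n == 0;
}
```

With the addresses from the text, a 4-byte access at 0x10000004 passes the
check and one at 0x10000005 does not.
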
> 
> 
> Why unaligned access is bad
> ===========================
> 
> Most architectures are unable to perform unaligned memory accesses. On such
> architectures, any unaligned access causes a processor exception.
> 
> Some architectures have an exception handler implemented in the kernel which
> corrects the memory access, but this is very expensive and is not true for
> all architectures. You cannot rely on the exception handler to correct your
> memory accesses.
> 
> In summary: if your code causes unaligned memory accesses to happen, your code
> will not work on some platforms, and will perform *very* badly on others.
> 
> You may be wondering why you have never seen these problems on your own
> architecture. Some architectures (such as i386 and x86_64) do not have this
> limitation, but nevertheless it is important for you to write portable code
> that works everywhere.
> 
> 
> Natural alignment
> =================
> 
> The rule we mentioned earlier forms what we refer to as natural alignment:
> When accessing N bytes of memory, the base memory address must be evenly
> divisible by N, i.e. addr % N == 0
> 
> When writing code, assume the target architecture has natural alignment
> requirements.
> 
> Sidenote: in reality, only a few architectures require natural alignment
> on all sizes of memory access. However, again we must consider ALL supported
> architectures; natural alignment is the only way to achieve full portability.
> 
> 
> Code that doesn't cause unaligned access
> ========================================
> 
> At first, the concepts above may seem a little hard to relate to actual
> coding practice. After all, you don't have a great deal of control over
> memory addresses of certain variables, etc.
> 
> Fortunately things are not too complex, as in most cases, the compiler
> ensures that things will work for you. For example, take the following
> structure:
> 
>       struct foo {
>               u16 field1;
>               u32 field2;
>               u8 field3;
>       };
> 
> Let us assume that an instance of the above structure resides in memory
> starting at address 0x10000000. With a basic level of understanding, it would
> not be unreasonable to expect that accessing field2 would cause an unaligned
> access. You'd be expecting field2 to be located at offset 2 bytes into the
> structure, i.e. address 0x10000002, but that address is not evenly divisible
> by 4 (remember, we're reading a 4 byte value here).
> 
> Fortunately, the compiler understands the alignment constraints, so in the
> above case it would insert 2 bytes of padding between field1 and field2.
> Therefore, for standard structure types you can always rely on the compiler
> to pad structures so that accesses to fields are suitably aligned (assuming
> you do not cast the field to a type of different length).
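
One way to see the padding is to ask the compiler where it placed the field.
The sketch below uses the standard <stdint.h> types as stand-ins for the
kernel's u16/u32/u8, and the helper name is made up; the exact offset depends
on the target ABI.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace copy of the example structure, using <stdint.h> types. */
struct foo {
	uint16_t field1;
	uint32_t field2;
	uint8_t field3;
};

/* Hypothetical helper: offset of field2 as chosen by the compiler. */
static size_t foo_field2_offset(void)
{
	return offsetof(struct foo, field2);
}
```

On an ABI that aligns 32-bit integers to 4 bytes, this should return 4 rather
than 2, which is the padding described above.
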
> 
> Similarly, you can also rely on the compiler to align variables and function
> parameters to a naturally aligned scheme, based on the size of the type of
> the variable.
> 
> Sidenote: you may wish to reorder the fields in the above structure so that
> the overall structure uses less memory. For example, moving field3 to sit
> between field1 and field2 (where the padding is inserted) would shrink the
> overall structure from 12 bytes to 8, assuming u32 is 4-byte aligned:
> 
>       struct foo {
>               u16 field1;
>               u8 field3;
>               u32 field2;
>       };
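
The effect of reordering can be checked with sizeof. This is a userspace
sketch: the struct names and the helper are made up for illustration, and the
sizes are an assumption about the target. On ABIs where a 32-bit integer is
4-byte aligned I would expect 12 and 8 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Field order as in the first example. */
struct foo_original {
	uint16_t field1;
	uint32_t field2;
	uint8_t field3;
};

/* field3 moved into the gap between field1 and field2. */
struct foo_reordered {
	uint16_t field1;
	uint8_t field3;
	uint32_t field2;
};

/* Hypothetical helper: bytes saved by the reordering on this target. */
static size_t foo_bytes_saved(void)
{
	return sizeof(struct foo_original) - sizeof(struct foo_reordered);
}
```
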
> 
> Sidenote: it should be obvious by now, but in case it is not, accessing a
> single byte (u8 or char) can never cause an unaligned access, because all
> memory addresses are evenly divisible by 1.
> 
> 
> Code that causes unaligned access
> =================================
> 
> With the above in mind, let's move on to a real-life example of a function
> that can cause an unaligned memory access. The following function, adapted
> from include/linux/etherdevice.h, is an optimized routine to compare two
> Ethernet MAC addresses for equality.
> 
> unsigned int compare_ether_addr(const u8 *addr1, const u8 *addr2)
> {
>       const u16 *a = (const u16 *) addr1;
>       const u16 *b = (const u16 *) addr2;
>       return ((a[0] ^ b[0]) | (a[1] ^ b[1]) | (a[2] ^ b[2])) != 0;
> }
> 
> In the above function, the reference to a[0] causes 2 bytes (16 bits) to
> be read from memory starting at address addr1. Think about what would happen
> if addr1 was an odd address, such as 0x10000003. (Hint: it'd be an unaligned
> access)
> 
> Despite the potential unaligned access problems with the above function, it
> is included in the kernel anyway but is documented to only work on
> 16-bit-aligned addresses. It is up to the caller to ensure this alignment or
> not use this function at all. This alignment-unsafe function is still useful
> as it is a decent optimization for the cases when you can ensure alignment.
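
For callers that cannot guarantee 16-bit alignment, a byte-wise comparison is
one possible fallback. The function below is a userspace sketch with a made-up
name, not something taken from the kernel; it keeps the same return convention
as the example above (0 when the addresses are equal).

```c
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6	/* length of an Ethernet MAC address in bytes */

/*
 * Hypothetical alignment-safe variant (not the kernel's function).
 * Returns 0 if the two addresses are equal and 1 otherwise.
 */
static unsigned int compare_ether_addr_bytewise(const uint8_t *addr1,
						const uint8_t *addr2)
{
	/* memcmp() accepts pointers of any alignment. */
	return memcmp(addr1, addr2, ETH_ALEN) != 0;
}
```

The trade-off is the one the text describes: this version is safe for any
address, while the 16-bit version is the faster choice when alignment is known.
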
> 
> 
> Here is another example of code that could cause unaligned accesses:
>       void myfunc(u8 *data, u32 value)
>       {
>               [...]
>               *((u32 *) data) = cpu_to_le32(value);
>               [...]
>       }
> 
> This code will cause unaligned accesses every time the data parameter points
> to an address that is not evenly divisible by 4.
> 
> 
> Consider the following structure:
>       struct foo {
>               u16 field1;
>               u32 field2;
>               u8 field3;
>       } __attribute__((packed));
> 
> It's the same structure as we looked at earlier, but the packed attribute has
> been added. This attribute ensures that the compiler never inserts any padding
> and the structure is laid out in memory exactly as is suggested above.
> 
> The packed attribute is useful when you want to use a C struct to represent
> some data that comes in a fixed arrangement 'off the wire'.
> 
> It should be clear why accessing fields of an instance of that structure could
> cause unaligned accesses in some situations. Even if the instance started at
> an address such as 0x10000000 where accessing field1 would not cause an
> unaligned access, accessing field2 would be reading 4 bytes from 0x10000002,
> which is an unaligned access. The compiler didn't jump to your rescue and
> insert padding because you asked it not to.
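
The packed layout can be confirmed the same way as the padded one. This sketch
assumes a compiler that understands __attribute__((packed)), such as GCC or
Clang; the struct and helper names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace copy of the packed example, using <stdint.h> types. */
struct foo_packed {
	uint16_t field1;
	uint32_t field2;
	uint8_t field3;
} __attribute__((packed));

/* Hypothetical helper: offset of field2 inside the packed structure. */
static size_t foo_packed_field2_offset(void)
{
	return offsetof(struct foo_packed, field2);
}
```

Here field2 sits at offset 2, so an instance placed at a 4-byte-aligned
address always has field2 at an address that is not divisible by 4.
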
> 
> 
> In summary, the 3 main scenarios where you may run into unaligned access
> problems involve:
>  1. Recasting variables to types of different lengths
>  2. Pointer arithmetic followed by access to at least 2 bytes of data
>  3. Accessing elements of packed structures
> 
> 
> Avoiding unaligned accesses
> ===========================
> 
> Going back to an earlier example:
>       void myfunc(u8 *data, u32 value)
>       {
>               [...]
>               *((u32 *) data) = cpu_to_le32(value);
>               [...]
>       }
> 
> To avoid the unaligned memory access, you could rewrite it as follows:
> 
>       void myfunc(u8 *data, u32 value)
>       {
>               [...]
>               value = cpu_to_le32(value);
>               memcpy(data, &value, sizeof(value));
>               [...]
>       }
> 
> memcpy() makes no assumptions about the alignment of its source or
> destination, so it will never cause an unaligned access.
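
An alternative to memcpy(), shown here as a userspace sketch where
cpu_to_le32() is not available: write the little-endian bytes one at a time.
Single-byte stores cannot be unaligned, and the shifts produce little-endian
order regardless of the host's byte order. The helper name is made up and is
not a kernel API.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): store value at p in little-endian
 * byte order using only single-byte writes, so p may have any alignment.
 */
static void store_le32_bytewise(uint8_t *p, uint32_t value)
{
	p[0] = (uint8_t)(value & 0xff);		/* least significant byte first */
	p[1] = (uint8_t)((value >> 8) & 0xff);
	p[2] = (uint8_t)((value >> 16) & 0xff);
	p[3] = (uint8_t)((value >> 24) & 0xff);
}
```
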
> 
> 
> Recall an example packed structure from earlier:
> 
>       struct foo {
>               u16 field1;
>               u32 field2;
>               u8 field3;
>       } __attribute__((packed));
> 
> The following code will potentially cause 2 unaligned accesses: writing to
> field2, then reading from field2:
> 
>       void myfunc2(u32 some_data)
>       {
>               struct foo myinstance;
>               u32 tmp;
> 
>               myinstance.field2 = some_data;
>               tmp = myinstance.field2 * 2;
>       }
> 
> When writing this code, you should be aware that field2 accesses are
> potentially unaligned, and therefore the above will break on some systems.
> The kernel provides two macros to simplify handling of situations such as
> the above:
> 
>       void myfunc2(u32 some_data)
>       {
>               struct foo myinstance;
>               u32 tmp;
> 
>               put_unaligned(some_data, &myinstance.field2);
>               tmp = get_unaligned(&myinstance.field2) * 2;
>       }
> 
> These macros work from pointers to the unaligned data, and work for memory
> accesses of any length (not just 32 bits as in the example above). You could
> even use put_unaligned() rather than memcpy() in order to solve the bug in
> the first example (myfunc()) given above.
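
To give a feel for what such accessors do, here is a userspace sketch built on
memcpy(). The sketch_* names are made up, the functions are fixed at 32 bits,
and this is not how the kernel implements get_unaligned()/put_unaligned(); it
only mirrors the argument order used in the example above.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace stand-in; NOT the kernel's get_unaligned(). */
static uint32_t sketch_get_unaligned32(const void *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));	/* memcpy() has no alignment requirement */
	return v;
}

/* Hypothetical userspace stand-in; NOT the kernel's put_unaligned(). */
static void sketch_put_unaligned32(uint32_t v, void *p)
{
	memcpy(p, &v, sizeof(v));
}
```

A value written with sketch_put_unaligned32() at an odd address can be read
back with sketch_get_unaligned32() from the same address.
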
> 
> --
> Author: Daniel Drake <[EMAIL PROTECTED]>
> With help from: Johannes Berg, Uli Kunitz.
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 
