On 11/26/25 13:23, david laight wrote:
On Wed, 26 Nov 2025 16:12:23 +0100
Helge Deller <[email protected]> wrote:
* david laight <[email protected]>:
On Wed, 26 Nov 2025 12:03:03 +0100
Helge Deller <[email protected]> wrote:
On 11/26/25 11:44, david laight wrote:
...
diff --git a/security/apparmor/match.c b/security/apparmor/match.c
index 26e82ba879d44..3dcc342337aca 100644
--- a/security/apparmor/match.c
+++ b/security/apparmor/match.c
@@ -71,10 +71,10 @@ static struct table_header *unpack_table(char *blob, size_t
bsize)
u8, u8, byte_to_byte);
Is that that just memcpy() ?
No, it's memcpy() only on big-endian machines.
You've misread the quoting...
The 'data8' case that was only half there is a memcpy().
On little-endian machines it converts from big-endian
16/32-bit ints to little-endian 16/32-bit ints.
But I see some potential for optimization here:
a) on big-endian machines just use memcpy()
true
b) on little-endian machines use memcpy() to copy from possibly-unaligned
memory to then known-to-be-aligned destination. Then use a loop with
be32_to_cpu() instead of get_unaligned_xx() as it's faster.
There is a function that does a loop byteswap of a buffer - no reason
to re-invent it.
I assumed there must be something, but I did not see it. Which one?
I can't find it either - just some functions to do an in-place swap.
(Which aren't usually a good idea)
But I doubt it is always (if ever) faster to do a copy and then byteswap.
The loop control and extra memory accesses kill performance.
Yes, you are probably right.
Not that I've seen a fast get_unaligned() - I don't think gcc or clang
generate optimal code - For LE I think it is something like:
low = *(addr & ~3);
high = *((addr + 3) & ~3);
shift = (addr & 3) * 8;
value = low << shift | high >> (32 - shift);
Note that it is only 2 aligned memory reads - even for 64bit.
Ok, then maybe we should keep it simple like this patch:
[PATCH v2] apparmor: Optimize table creation from possibly unaligned memory
Source blob may come from userspace and might be unaligned.
Try to optize the copying process by avoiding unaligned memory accesses.
Not sure that reads right.
'Allow for misaligned data from userspace when byteswapping source blob' ?
Signed-off-by: Helge Deller <[email protected]>
diff --git a/security/apparmor/include/match.h
b/security/apparmor/include/match.h
index 1fbe82f5021b..386da2023d50 100644
--- a/security/apparmor/include/match.h
+++ b/security/apparmor/include/match.h
@@ -104,16 +104,20 @@ struct aa_dfa {
struct table_header *tables[YYTD_ID_TSIZE];
};
-#define byte_to_byte(X) (X)
+#define byte_to_byte(X) (*(X))
It's a bit of a shame you need something for the above...
We got rid of that in the last patch by just replacing the call to
UNPACK_ARRAY for bytes with just a call to memcpy.
David
#define UNPACK_ARRAY(TABLE, BLOB, LEN, TTYPE, BTYPE, NTOHX) \
do { \
typeof(LEN) __i; \
TTYPE *__t = (TTYPE *) TABLE; \
BTYPE *__b = (BTYPE *) BLOB; \
- for (__i = 0; __i < LEN; __i++) { \
- __t[__i] = NTOHX(__b[__i]); \
- } \
+ BUILD_BUG_ON(sizeof(TTYPE) != sizeof(BTYPE)); \
+ if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN) || sizeof(BTYPE) == 1) \
+ memcpy(__t, __b, (LEN) * sizeof(BTYPE)); \
+ else /* copy & convert convert from big-endian */ \
+ for (__i = 0; __i < LEN; __i++) { \
+ __t[__i] = NTOHX(&__b[__i]); \
+ } \
} while (0)
static inline size_t table_size(size_t len, size_t el_size)
diff --git a/security/apparmor/match.c b/security/apparmor/match.c
index c5a91600842a..13e2f6873329 100644
--- a/security/apparmor/match.c
+++ b/security/apparmor/match.c
@@ -15,6 +15,7 @@
#include <linux/vmalloc.h>
#include <linux/err.h>
#include <linux/kref.h>
+#include <linux/unaligned.h>
#include "include/lib.h"
#include "include/match.h"
@@ -70,10 +71,10 @@ static struct table_header *unpack_table(char *blob, size_t
bsize)
u8, u8, byte_to_byte);
else if (th.td_flags == YYTD_DATA16)
UNPACK_ARRAY(table->td_data, blob, th.td_lolen,
- u16, __be16, be16_to_cpu);
+ u16, __be16, get_unaligned_be16);
else if (th.td_flags == YYTD_DATA32)
UNPACK_ARRAY(table->td_data, blob, th.td_lolen,
- u32, __be32, be32_to_cpu);
+ u32, __be32, get_unaligned_be32);
else
goto fail;
/* if table was vmalloced make sure the page tables are synced