Re: [PATCH] connector: improved unaligned access error fix
From: Chris Metcalf Date: Thu, 14 Nov 2013 12:09:21 -0500 > In af3e095a1fb4, Erik Jacobsen fixed one type of unaligned access > bug for ia64 by converting a 64-bit write to use put_unaligned(). > Unfortunately, since gcc will convert a short memset() to a series > of appropriately-aligned stores, the problem is now visible again > on tilegx, where the memset that zeros out proc_event is converted > to three 64-bit stores, causing an unaligned access panic. > > A better fix for the original problem is to ensure that proc_event > is aligned to 8 bytes here. We can do that relatively easily by > arranging to start the struct cn_msg aligned to 8 bytes and then > offset by 4 bytes. Doing so means that the immediately following > proc_event structure is then correctly aligned to 8 bytes. > > The result is that the memset() stores are now aligned, and as an > added benefit, we can remove the put_unaligned() calls in the code. > > Signed-off-by: Chris Metcalf This looks fine to me, applied and queued up for -stable, thanks Chris. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] connector: improved unaligned access error fix
On 11/14/2013 2:45 PM, Pete Zaitcev wrote: > On Thu, 14 Nov 2013 12:09:21 -0500 > Chris Metcalf wrote: > >> -__u8 buffer[CN_PROC_MSG_SIZE]; >> +__u8 buffer[CN_PROC_MSG_SIZE] __aligned(8); >> -msg = (struct cn_msg *)buffer; >> +msg = buffer_to_cn_msg(buffer); >> ev = (struct proc_event *)msg->data; >> memset(>event_data, 0, sizeof(ev->event_data)); > Why is memset(buffer,0,CN_PROC_MSG_SIZE) not acceptable? That would be fine from a correctness point of view; I'm happy either way. My patch nominally has better performance, for what that's worth, since the memset() call is for a smaller range (24 bytes instead of 60). It also avoids the need for put_unaligned(), which even on platforms that allow unaligned stores can still be slower. I can certainly do a v2 with the larger memset() instead if that's the consensus. -- Chris Metcalf, Tilera Corp. http://www.tilera.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] connector: improved unaligned access error fix
On Thu, 14 Nov 2013 12:09:21 -0500 Chris Metcalf wrote: > - __u8 buffer[CN_PROC_MSG_SIZE]; > + __u8 buffer[CN_PROC_MSG_SIZE] __aligned(8); > - msg = (struct cn_msg *)buffer; > + msg = buffer_to_cn_msg(buffer); > ev = (struct proc_event *)msg->data; > memset(>event_data, 0, sizeof(ev->event_data)); Why is memset(buffer,0,CN_PROC_MSG_SIZE) not acceptable? -- Pete -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] connector: improved unaligned access error fix
In af3e095a1fb4, Erik Jacobsen fixed one type of unaligned access bug for ia64 by converting a 64-bit write to use put_unaligned(). Unfortunately, since gcc will convert a short memset() to a series of appropriately-aligned stores, the problem is now visible again on tilegx, where the memset that zeros out proc_event is converted to three 64-bit stores, causing an unaligned access panic. A better fix for the original problem is to ensure that proc_event is aligned to 8 bytes here. We can do that relatively easily by arranging to start the struct cn_msg aligned to 8 bytes and then offset by 4 bytes. Doing so means that the immediately following proc_event structure is then correctly aligned to 8 bytes. The result is that the memset() stores are now aligned, and as an added benefit, we can remove the put_unaligned() calls in the code. Signed-off-by: Chris Metcalf --- drivers/connector/cn_proc.c | 72 ++--- 1 file changed, 42 insertions(+), 30 deletions(-) diff --git a/drivers/connector/cn_proc.c b/drivers/connector/cn_proc.c index c73fc2b74de2..18c5b9b16645 100644 --- a/drivers/connector/cn_proc.c +++ b/drivers/connector/cn_proc.c @@ -32,11 +32,23 @@ #include #include -#include - #include -#define CN_PROC_MSG_SIZE (sizeof(struct cn_msg) + sizeof(struct proc_event)) +/* + * Size of a cn_msg followed by a proc_event structure. Since the + * sizeof struct cn_msg is a multiple of 4 bytes, but not 8 bytes, we + * add one 4-byte word to the size here, and then start the actual + * cn_msg structure 4 bytes into the stack buffer. The result is that + * the immediately following proc_event structure is aligned to 8 bytes. + */ +#define CN_PROC_MSG_SIZE (sizeof(struct cn_msg) + sizeof(struct proc_event) + 4) + +/* See comment above; we test our assumption about sizeof struct cn_msg here. */ +static inline struct cn_msg *buffer_to_cn_msg(__u8 *buffer) +{ + BUILD_BUG_ON(sizeof(struct cn_msg) != 20); + return (struct cn_msg *)(buffer + 4); +} static atomic_t proc_event_num_listeners = ATOMIC_INIT(0); static struct cb_id cn_proc_event_id = { CN_IDX_PROC, CN_VAL_PROC }; @@ -56,19 +68,19 @@ void proc_fork_connector(struct task_struct *task) { struct cn_msg *msg; struct proc_event *ev; - __u8 buffer[CN_PROC_MSG_SIZE]; + __u8 buffer[CN_PROC_MSG_SIZE] __aligned(8); struct timespec ts; struct task_struct *parent; if (atomic_read(_event_num_listeners) < 1) return; - msg = (struct cn_msg *)buffer; + msg = buffer_to_cn_msg(buffer); ev = (struct proc_event *)msg->data; memset(>event_data, 0, sizeof(ev->event_data)); get_seq(>seq, >cpu); ktime_get_ts(); /* get high res monotonic timestamp */ - put_unaligned(timespec_to_ns(), (__u64 *)>timestamp_ns); + ev->timestamp_ns = timespec_to_ns(); ev->what = PROC_EVENT_FORK; rcu_read_lock(); parent = rcu_dereference(task->real_parent); @@ -91,17 +103,17 @@ void proc_exec_connector(struct task_struct *task) struct cn_msg *msg; struct proc_event *ev; struct timespec ts; - __u8 buffer[CN_PROC_MSG_SIZE]; + __u8 buffer[CN_PROC_MSG_SIZE] __aligned(8); if (atomic_read(_event_num_listeners) < 1) return; - msg = (struct cn_msg *)buffer; + msg = buffer_to_cn_msg(buffer); ev = (struct proc_event *)msg->data; memset(>event_data, 0, sizeof(ev->event_data)); get_seq(>seq, >cpu); ktime_get_ts(); /* get high res monotonic timestamp */ - put_unaligned(timespec_to_ns(), (__u64 *)>timestamp_ns); + ev->timestamp_ns = timespec_to_ns(); ev->what = PROC_EVENT_EXEC; ev->event_data.exec.process_pid = task->pid; ev->event_data.exec.process_tgid = task->tgid; @@ -117,14 +129,14 @@ void proc_id_connector(struct task_struct *task, int which_id) { struct cn_msg *msg; struct proc_event *ev; - __u8 buffer[CN_PROC_MSG_SIZE]; + __u8 buffer[CN_PROC_MSG_SIZE] __aligned(8); struct timespec ts; const struct cred *cred; if (atomic_read(_event_num_listeners) < 1) return; - msg = (struct cn_msg *)buffer; + msg = buffer_to_cn_msg(buffer); ev = (struct proc_event *)msg->data; memset(>event_data, 0, sizeof(ev->event_data)); ev->what = which_id; @@ -145,7 +157,7 @@ void proc_id_connector(struct task_struct *task, int which_id) rcu_read_unlock(); get_seq(>seq, >cpu); ktime_get_ts(); /* get high res monotonic timestamp */ - put_unaligned(timespec_to_ns(), (__u64 *)>timestamp_ns); + ev->timestamp_ns = timespec_to_ns(); memcpy(>id, _proc_event_id, sizeof(msg->id)); msg->ack = 0; /* not used */ @@ -159,17 +171,17 @@ void proc_sid_connector(struct task_struct *task) struct cn_msg *msg; struct
[PATCH] connector: improved unaligned access error fix
In af3e095a1fb4, Erik Jacobsen fixed one type of unaligned access bug for ia64 by converting a 64-bit write to use put_unaligned(). Unfortunately, since gcc will convert a short memset() to a series of appropriately-aligned stores, the problem is now visible again on tilegx, where the memset that zeros out proc_event is converted to three 64-bit stores, causing an unaligned access panic. A better fix for the original problem is to ensure that proc_event is aligned to 8 bytes here. We can do that relatively easily by arranging to start the struct cn_msg aligned to 8 bytes and then offset by 4 bytes. Doing so means that the immediately following proc_event structure is then correctly aligned to 8 bytes. The result is that the memset() stores are now aligned, and as an added benefit, we can remove the put_unaligned() calls in the code. Signed-off-by: Chris Metcalf cmetc...@tilera.com --- drivers/connector/cn_proc.c | 72 ++--- 1 file changed, 42 insertions(+), 30 deletions(-) diff --git a/drivers/connector/cn_proc.c b/drivers/connector/cn_proc.c index c73fc2b74de2..18c5b9b16645 100644 --- a/drivers/connector/cn_proc.c +++ b/drivers/connector/cn_proc.c @@ -32,11 +32,23 @@ #include linux/atomic.h #include linux/pid_namespace.h -#include asm/unaligned.h - #include linux/cn_proc.h -#define CN_PROC_MSG_SIZE (sizeof(struct cn_msg) + sizeof(struct proc_event)) +/* + * Size of a cn_msg followed by a proc_event structure. Since the + * sizeof struct cn_msg is a multiple of 4 bytes, but not 8 bytes, we + * add one 4-byte word to the size here, and then start the actual + * cn_msg structure 4 bytes into the stack buffer. The result is that + * the immediately following proc_event structure is aligned to 8 bytes. + */ +#define CN_PROC_MSG_SIZE (sizeof(struct cn_msg) + sizeof(struct proc_event) + 4) + +/* See comment above; we test our assumption about sizeof struct cn_msg here. */ +static inline struct cn_msg *buffer_to_cn_msg(__u8 *buffer) +{ + BUILD_BUG_ON(sizeof(struct cn_msg) != 20); + return (struct cn_msg *)(buffer + 4); +} static atomic_t proc_event_num_listeners = ATOMIC_INIT(0); static struct cb_id cn_proc_event_id = { CN_IDX_PROC, CN_VAL_PROC }; @@ -56,19 +68,19 @@ void proc_fork_connector(struct task_struct *task) { struct cn_msg *msg; struct proc_event *ev; - __u8 buffer[CN_PROC_MSG_SIZE]; + __u8 buffer[CN_PROC_MSG_SIZE] __aligned(8); struct timespec ts; struct task_struct *parent; if (atomic_read(proc_event_num_listeners) 1) return; - msg = (struct cn_msg *)buffer; + msg = buffer_to_cn_msg(buffer); ev = (struct proc_event *)msg-data; memset(ev-event_data, 0, sizeof(ev-event_data)); get_seq(msg-seq, ev-cpu); ktime_get_ts(ts); /* get high res monotonic timestamp */ - put_unaligned(timespec_to_ns(ts), (__u64 *)ev-timestamp_ns); + ev-timestamp_ns = timespec_to_ns(ts); ev-what = PROC_EVENT_FORK; rcu_read_lock(); parent = rcu_dereference(task-real_parent); @@ -91,17 +103,17 @@ void proc_exec_connector(struct task_struct *task) struct cn_msg *msg; struct proc_event *ev; struct timespec ts; - __u8 buffer[CN_PROC_MSG_SIZE]; + __u8 buffer[CN_PROC_MSG_SIZE] __aligned(8); if (atomic_read(proc_event_num_listeners) 1) return; - msg = (struct cn_msg *)buffer; + msg = buffer_to_cn_msg(buffer); ev = (struct proc_event *)msg-data; memset(ev-event_data, 0, sizeof(ev-event_data)); get_seq(msg-seq, ev-cpu); ktime_get_ts(ts); /* get high res monotonic timestamp */ - put_unaligned(timespec_to_ns(ts), (__u64 *)ev-timestamp_ns); + ev-timestamp_ns = timespec_to_ns(ts); ev-what = PROC_EVENT_EXEC; ev-event_data.exec.process_pid = task-pid; ev-event_data.exec.process_tgid = task-tgid; @@ -117,14 +129,14 @@ void proc_id_connector(struct task_struct *task, int which_id) { struct cn_msg *msg; struct proc_event *ev; - __u8 buffer[CN_PROC_MSG_SIZE]; + __u8 buffer[CN_PROC_MSG_SIZE] __aligned(8); struct timespec ts; const struct cred *cred; if (atomic_read(proc_event_num_listeners) 1) return; - msg = (struct cn_msg *)buffer; + msg = buffer_to_cn_msg(buffer); ev = (struct proc_event *)msg-data; memset(ev-event_data, 0, sizeof(ev-event_data)); ev-what = which_id; @@ -145,7 +157,7 @@ void proc_id_connector(struct task_struct *task, int which_id) rcu_read_unlock(); get_seq(msg-seq, ev-cpu); ktime_get_ts(ts); /* get high res monotonic timestamp */ - put_unaligned(timespec_to_ns(ts), (__u64 *)ev-timestamp_ns); + ev-timestamp_ns = timespec_to_ns(ts); memcpy(msg-id, cn_proc_event_id, sizeof(msg-id)); msg-ack = 0; /* not
Re: [PATCH] connector: improved unaligned access error fix
On Thu, 14 Nov 2013 12:09:21 -0500 Chris Metcalf cmetc...@tilera.com wrote: - __u8 buffer[CN_PROC_MSG_SIZE]; + __u8 buffer[CN_PROC_MSG_SIZE] __aligned(8); - msg = (struct cn_msg *)buffer; + msg = buffer_to_cn_msg(buffer); ev = (struct proc_event *)msg-data; memset(ev-event_data, 0, sizeof(ev-event_data)); Why is memset(buffer,0,CN_PROC_MSG_SIZE) not acceptable? -- Pete -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] connector: improved unaligned access error fix
On 11/14/2013 2:45 PM, Pete Zaitcev wrote: On Thu, 14 Nov 2013 12:09:21 -0500 Chris Metcalf cmetc...@tilera.com wrote: -__u8 buffer[CN_PROC_MSG_SIZE]; +__u8 buffer[CN_PROC_MSG_SIZE] __aligned(8); -msg = (struct cn_msg *)buffer; +msg = buffer_to_cn_msg(buffer); ev = (struct proc_event *)msg-data; memset(ev-event_data, 0, sizeof(ev-event_data)); Why is memset(buffer,0,CN_PROC_MSG_SIZE) not acceptable? That would be fine from a correctness point of view; I'm happy either way. My patch nominally has better performance, for what that's worth, since the memset() call is for a smaller range (24 bytes instead of 60). It also avoids the need for put_unaligned(), which even on platforms that allow unaligned stores can still be slower. I can certainly do a v2 with the larger memset() instead if that's the consensus. -- Chris Metcalf, Tilera Corp. http://www.tilera.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] connector: improved unaligned access error fix
From: Chris Metcalf cmetc...@tilera.com Date: Thu, 14 Nov 2013 12:09:21 -0500 In af3e095a1fb4, Erik Jacobsen fixed one type of unaligned access bug for ia64 by converting a 64-bit write to use put_unaligned(). Unfortunately, since gcc will convert a short memset() to a series of appropriately-aligned stores, the problem is now visible again on tilegx, where the memset that zeros out proc_event is converted to three 64-bit stores, causing an unaligned access panic. A better fix for the original problem is to ensure that proc_event is aligned to 8 bytes here. We can do that relatively easily by arranging to start the struct cn_msg aligned to 8 bytes and then offset by 4 bytes. Doing so means that the immediately following proc_event structure is then correctly aligned to 8 bytes. The result is that the memset() stores are now aligned, and as an added benefit, we can remove the put_unaligned() calls in the code. Signed-off-by: Chris Metcalf cmetc...@tilera.com This looks fine to me, applied and queued up for -stable, thanks Chris. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/