Re: [PATCH] connector: improved unaligned access error fix

2013-11-14 Thread David Miller
From: Chris Metcalf 
Date: Thu, 14 Nov 2013 12:09:21 -0500

> In af3e095a1fb4, Erik Jacobsen fixed one type of unaligned access
> bug for ia64 by converting a 64-bit write to use put_unaligned().
> Unfortunately, since gcc will convert a short memset() to a series
> of appropriately-aligned stores, the problem is now visible again
> on tilegx, where the memset that zeros out proc_event is converted
> to three 64-bit stores, causing an unaligned access panic.
> 
> A better fix for the original problem is to ensure that proc_event
> is aligned to 8 bytes here.  We can do that relatively easily by
> arranging to start the struct cn_msg aligned to 8 bytes and then
> offset by 4 bytes.  Doing so means that the immediately following
> proc_event structure is then correctly aligned to 8 bytes.
> 
> The result is that the memset() stores are now aligned, and as an
> added benefit, we can remove the put_unaligned() calls in the code.
> 
> Signed-off-by: Chris Metcalf 

This looks fine to me, applied and queued up for -stable, thanks
Chris.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] connector: improved unaligned access error fix

2013-11-14 Thread Chris Metcalf
On 11/14/2013 2:45 PM, Pete Zaitcev wrote:
> On Thu, 14 Nov 2013 12:09:21 -0500
> Chris Metcalf  wrote:
>
>> -__u8 buffer[CN_PROC_MSG_SIZE];
>> +__u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);
>> -msg = (struct cn_msg *)buffer;
>> +msg = buffer_to_cn_msg(buffer);
>>  ev = (struct proc_event *)msg->data;
>>  memset(&ev->event_data, 0, sizeof(ev->event_data));
> Why is memset(buffer,0,CN_PROC_MSG_SIZE) not acceptable?

That would be fine from a correctness point of view; I'm happy
either way.  My patch nominally has better performance, for
what that's worth, since the memset() call is for a smaller
range (24 bytes instead of 60).  It also avoids the need for
put_unaligned(), which even on platforms that allow unaligned
stores can still be slower.

I can certainly do a v2 with the larger memset() instead if
that's the consensus.

-- 
Chris Metcalf, Tilera Corp.
http://www.tilera.com

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] connector: improved unaligned access error fix

2013-11-14 Thread Pete Zaitcev
On Thu, 14 Nov 2013 12:09:21 -0500
Chris Metcalf  wrote:

> - __u8 buffer[CN_PROC_MSG_SIZE];
> + __u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);

> - msg = (struct cn_msg *)buffer;
> + msg = buffer_to_cn_msg(buffer);
>   ev = (struct proc_event *)msg->data;
>   memset(&ev->event_data, 0, sizeof(ev->event_data));

Why is memset(buffer,0,CN_PROC_MSG_SIZE) not acceptable?

-- Pete
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] connector: improved unaligned access error fix

2013-11-14 Thread Chris Metcalf
In af3e095a1fb4, Erik Jacobsen fixed one type of unaligned access
bug for ia64 by converting a 64-bit write to use put_unaligned().
Unfortunately, since gcc will convert a short memset() to a series
of appropriately-aligned stores, the problem is now visible again
on tilegx, where the memset that zeros out proc_event is converted
to three 64-bit stores, causing an unaligned access panic.

A better fix for the original problem is to ensure that proc_event
is aligned to 8 bytes here.  We can do that relatively easily by
arranging to start the struct cn_msg aligned to 8 bytes and then
offset by 4 bytes.  Doing so means that the immediately following
proc_event structure is then correctly aligned to 8 bytes.

The result is that the memset() stores are now aligned, and as an
added benefit, we can remove the put_unaligned() calls in the code.

Signed-off-by: Chris Metcalf 
---
 drivers/connector/cn_proc.c | 72 ++---
 1 file changed, 42 insertions(+), 30 deletions(-)

diff --git a/drivers/connector/cn_proc.c b/drivers/connector/cn_proc.c
index c73fc2b74de2..18c5b9b16645 100644
--- a/drivers/connector/cn_proc.c
+++ b/drivers/connector/cn_proc.c
@@ -32,11 +32,23 @@
 #include 
 #include 
 
-#include 
-
 #include 
 
-#define CN_PROC_MSG_SIZE (sizeof(struct cn_msg) + sizeof(struct proc_event))
+/*
+ * Size of a cn_msg followed by a proc_event structure.  Since the
+ * sizeof struct cn_msg is a multiple of 4 bytes, but not 8 bytes, we
+ * add one 4-byte word to the size here, and then start the actual
+ * cn_msg structure 4 bytes into the stack buffer.  The result is that
+ * the immediately following proc_event structure is aligned to 8 bytes.
+ */
+#define CN_PROC_MSG_SIZE (sizeof(struct cn_msg) + sizeof(struct proc_event) + 
4)
+
+/* See comment above; we test our assumption about sizeof struct cn_msg here. 
*/
+static inline struct cn_msg *buffer_to_cn_msg(__u8 *buffer)
+{
+   BUILD_BUG_ON(sizeof(struct cn_msg) != 20);
+   return (struct cn_msg *)(buffer + 4);
+}
 
 static atomic_t proc_event_num_listeners = ATOMIC_INIT(0);
 static struct cb_id cn_proc_event_id = { CN_IDX_PROC, CN_VAL_PROC };
@@ -56,19 +68,19 @@ void proc_fork_connector(struct task_struct *task)
 {
struct cn_msg *msg;
struct proc_event *ev;
-   __u8 buffer[CN_PROC_MSG_SIZE];
+   __u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);
struct timespec ts;
struct task_struct *parent;
 
if (atomic_read(&proc_event_num_listeners) < 1)
return;
 
-   msg = (struct cn_msg *)buffer;
+   msg = buffer_to_cn_msg(buffer);
ev = (struct proc_event *)msg->data;
memset(&ev->event_data, 0, sizeof(ev->event_data));
get_seq(&msg->seq, &ev->cpu);
ktime_get_ts(&ts); /* get high res monotonic timestamp */
-   put_unaligned(timespec_to_ns(&ts), (__u64 *)&ev->timestamp_ns);
+   ev->timestamp_ns = timespec_to_ns(&ts);
ev->what = PROC_EVENT_FORK;
rcu_read_lock();
parent = rcu_dereference(task->real_parent);
@@ -91,17 +103,17 @@ void proc_exec_connector(struct task_struct *task)
struct cn_msg *msg;
struct proc_event *ev;
struct timespec ts;
-   __u8 buffer[CN_PROC_MSG_SIZE];
+   __u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);
 
if (atomic_read(&proc_event_num_listeners) < 1)
return;
 
-   msg = (struct cn_msg *)buffer;
+   msg = buffer_to_cn_msg(buffer);
ev = (struct proc_event *)msg->data;
memset(&ev->event_data, 0, sizeof(ev->event_data));
get_seq(&msg->seq, &ev->cpu);
ktime_get_ts(&ts); /* get high res monotonic timestamp */
-   put_unaligned(timespec_to_ns(&ts), (__u64 *)&ev->timestamp_ns);
+   ev->timestamp_ns = timespec_to_ns(&ts);
ev->what = PROC_EVENT_EXEC;
ev->event_data.exec.process_pid = task->pid;
ev->event_data.exec.process_tgid = task->tgid;
@@ -117,14 +129,14 @@ void proc_id_connector(struct task_struct *task, int 
which_id)
 {
struct cn_msg *msg;
struct proc_event *ev;
-   __u8 buffer[CN_PROC_MSG_SIZE];
+   __u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);
struct timespec ts;
const struct cred *cred;
 
if (atomic_read(&proc_event_num_listeners) < 1)
return;
 
-   msg = (struct cn_msg *)buffer;
+   msg = buffer_to_cn_msg(buffer);
ev = (struct proc_event *)msg->data;
memset(&ev->event_data, 0, sizeof(ev->event_data));
ev->what = which_id;
@@ -145,7 +157,7 @@ void proc_id_connector(struct task_struct *task, int 
which_id)
rcu_read_unlock();
get_seq(&msg->seq, &ev->cpu);
ktime_get_ts(&ts); /* get high res monotonic timestamp */
-   put_unaligned(timespec_to_ns(&ts), (__u64 *)&ev->timestamp_ns);
+   ev->timestamp_ns = timespec_to_ns(&ts);
 
memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
msg->ack = 0; /* not used */
@@ -159,17 +171,1