Re: [PATCH v2 4/7] AMSO1100 Memory Management.

2006-06-16 Thread Nick Piggin

Tom Tucker wrote:

On Thu, 2006-06-08 at 01:17 -0700, Andrew Morton wrote:


On Wed, 07 Jun 2006 15:06:55 -0500
Steve Wise [EMAIL PROTECTED] wrote:



+void c2_free(struct c2_alloc *alloc, u32 obj)
+{
+   spin_lock(alloc-lock);
+   clear_bit(obj, alloc-table);
+   spin_unlock(alloc-lock);
+}


The spinlock is unneeded here.



Good point.


Really?

clear_bit does not give you any memory ordering, so you can have
the situation where another CPU sees the bit cleared, but this
CPU still has stores pending to whatever it is being freed. Or
any number of other nasty memory ordering badness.

I'd just use the spinlocks, and prepend the clear_bit with a
double underscore (so you get the non-atomic version), if that
is appropriate.

The spinlocks nicely handle all the memory ordering issues, and
serve to document concurrency issues. If you need every last bit
of performance and scalability, that's OK, but you need comments
and I suspect you'd need more memory barriers.

--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com 


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 4/7] AMSO1100 Memory Management.

2006-06-12 Thread Tom Tucker
On Thu, 2006-06-08 at 01:17 -0700, Andrew Morton wrote:
 On Wed, 07 Jun 2006 15:06:55 -0500
 Steve Wise [EMAIL PROTECTED] wrote:
 
  
  +void c2_free(struct c2_alloc *alloc, u32 obj)
  +{
  +   spin_lock(alloc-lock);
  +   clear_bit(obj, alloc-table);
  +   spin_unlock(alloc-lock);
  +}
 
 The spinlock is unneeded here.

Good point.

 
 
 What does all the code in this file do, anyway?  It looks totally generic
 (and hence inappropriate for drivers/infiniband/hw/amso1100/) and somewhat
 similar to idr trees, perhaps.
 

We mimicked the mthca driver. It may be code that should be replaced
with Linux core services for new drivers. We'll investigate.

  +int c2_array_set(struct c2_array *array, int index, void *value)
  +{
  +   int p = (index * sizeof(void *))  PAGE_SHIFT;
  +
  +   /* Allocate with GFP_ATOMIC because we'll be called with locks held. */
  +   if (!array-page_list[p].page)
  +   array-page_list[p].page =
  +   (void **) get_zeroed_page(GFP_ATOMIC);
  +
  +   if (!array-page_list[p].page)
  +   return -ENOMEM;
 
 This _will_ happen under load.  What will the result of that be, in the
 context of thise driver?

A higher level object allocation will fail. In this case, a kernel
application request will fail and the application must handle the error.
 
 This function is incorrectly designed - it should receive a gfp_t argument.
 Because you don't *know* that the caller will always hold a spinlock.  And
 GFP_KERNEL is far, far stronger than GFP_ATOMIC.

This service is allocating a page that the adapter will DMA 2B message
indices into. 
 
  +static int c2_alloc_mqsp_chunk(gfp_t gfp_mask, struct sp_chunk **head)
  +{
  +   int i;
  +   struct sp_chunk *new_head;
  +
  +   new_head = (struct sp_chunk *) __get_free_page(gfp_mask | GFP_DMA);
 
 Why is __GFP_DMA in there?  Unless you've cornered the ISA bus infiniband
 market, it's likely to be wrong.
 

Flag confusion about what GFP_DMA means. We'll revisit this whole
file ... 

 
 -
 To unsubscribe from this list: send the line unsubscribe netdev in
 the body of a message to [EMAIL PROTECTED]
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 4/7] AMSO1100 Memory Management.

2006-06-08 Thread Andrew Morton
On Wed, 07 Jun 2006 15:06:55 -0500
Steve Wise [EMAIL PROTECTED] wrote:

 
 +void c2_free(struct c2_alloc *alloc, u32 obj)
 +{
 + spin_lock(alloc-lock);
 + clear_bit(obj, alloc-table);
 + spin_unlock(alloc-lock);
 +}

The spinlock is unneeded here.


What does all the code in this file do, anyway?  It looks totally generic
(and hence inappropriate for drivers/infiniband/hw/amso1100/) and somewhat
similar to idr trees, perhaps.

 +int c2_array_set(struct c2_array *array, int index, void *value)
 +{
 + int p = (index * sizeof(void *))  PAGE_SHIFT;
 +
 + /* Allocate with GFP_ATOMIC because we'll be called with locks held. */
 + if (!array-page_list[p].page)
 + array-page_list[p].page =
 + (void **) get_zeroed_page(GFP_ATOMIC);
 +
 + if (!array-page_list[p].page)
 + return -ENOMEM;

This _will_ happen under load.  What will the result of that be, in the
context of thise driver?

This function is incorrectly designed - it should receive a gfp_t argument.
Because you don't *know* that the caller will always hold a spinlock.  And
GFP_KERNEL is far, far stronger than GFP_ATOMIC.

 +static int c2_alloc_mqsp_chunk(gfp_t gfp_mask, struct sp_chunk **head)
 +{
 + int i;
 + struct sp_chunk *new_head;
 +
 + new_head = (struct sp_chunk *) __get_free_page(gfp_mask | GFP_DMA);

Why is __GFP_DMA in there?  Unless you've cornered the ISA bus infiniband
market, it's likely to be wrong.


-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 4/7] AMSO1100 Memory Management.

2006-06-07 Thread Steve Wise

Review Changes:

- sizeof - sizeof()
---

 drivers/infiniband/hw/amso1100/c2_alloc.c |  256 
 drivers/infiniband/hw/amso1100/c2_mm.c|  378 +
 2 files changed, 634 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/amso1100/c2_alloc.c 
b/drivers/infiniband/hw/amso1100/c2_alloc.c
new file mode 100644
index 000..e496eb7
--- /dev/null
+++ b/drivers/infiniband/hw/amso1100/c2_alloc.c
@@ -0,0 +1,256 @@
+/*
+ * Copyright (c) 2004 Topspin Communications.  All rights reserved.
+ * Copyright (c) 2005 Open Grid Computing, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *copyright notice, this list of conditions and the following
+ *disclaimer in the documentation and/or other materials
+ *provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include linux/errno.h
+#include linux/slab.h
+#include linux/bitmap.h
+
+#include c2.h
+
+/* Trivial bitmap-based allocator */
+u32 c2_alloc(struct c2_alloc *alloc)
+{
+   u32 obj;
+
+   spin_lock(alloc-lock);
+   obj = find_next_zero_bit(alloc-table, alloc-max, alloc-last);
+   if (obj = alloc-max)
+   obj = find_first_zero_bit(alloc-table, alloc-max);
+   if (obj = 0) {
+   alloc-last = obj+1;
+   if (alloc-last  alloc-max)
+   alloc-last = 0;
+   }
+   spin_unlock(alloc-lock);
+
+   return obj;
+}
+
+void c2_free(struct c2_alloc *alloc, u32 obj)
+{
+   spin_lock(alloc-lock);
+   clear_bit(obj, alloc-table);
+   spin_unlock(alloc-lock);
+}
+
+int c2_alloc_init(struct c2_alloc *alloc, u32 num, u32 reserved)
+{
+   int i;
+
+   alloc-last = 0;
+   alloc-max = num;
+   spin_lock_init(alloc-lock);
+   alloc-table = kmalloc(BITS_TO_LONGS(num) * sizeof(long), GFP_KERNEL);
+   if (!alloc-table)
+   return -ENOMEM;
+
+   bitmap_zero(alloc-table, num);
+   for (i = 0; i  reserved; ++i)
+   set_bit(i, alloc-table);
+
+   return 0;
+}
+
+void c2_alloc_cleanup(struct c2_alloc *alloc)
+{
+   kfree(alloc-table);
+}
+
+/*
+ * Array of pointers with lazy allocation of leaf pages.  Callers of
+ * _get, _set and _clear methods must use a lock or otherwise
+ * serialize access to the array.
+ */
+
+void *c2_array_get(struct c2_array *array, int index)
+{
+   int p = (index * sizeof(void *))  PAGE_SHIFT;
+
+   if (array-page_list[p].page) {
+   int i = index  (PAGE_SIZE / sizeof(void *) - 1);
+   return array-page_list[p].page[i];
+   } else
+   return NULL;
+}
+
+int c2_array_set(struct c2_array *array, int index, void *value)
+{
+   int p = (index * sizeof(void *))  PAGE_SHIFT;
+
+   /* Allocate with GFP_ATOMIC because we'll be called with locks held. */
+   if (!array-page_list[p].page)
+   array-page_list[p].page =
+   (void **) get_zeroed_page(GFP_ATOMIC);
+
+   if (!array-page_list[p].page)
+   return -ENOMEM;
+
+   array-page_list[p].page[index  (PAGE_SIZE / sizeof(void *) - 1)] =
+   value;
+   ++array-page_list[p].used;
+
+   return 0;
+}
+
+void c2_array_clear(struct c2_array *array, int index)
+{
+   int p = (index * sizeof(void *))  PAGE_SHIFT;
+
+   if (--array-page_list[p].used == 0) {
+   free_page((unsigned long) array-page_list[p].page);
+   array-page_list[p].page = NULL;
+   }
+
+   if (array-page_list[p].used  0)
+   pr_debug(Array %p index %d page %d with ref count %d  0\n,
+array, index, p, array-page_list[p].used);
+}
+
+int c2_array_init(struct c2_array *array, int nent)
+{
+   int npage = (nent * sizeof(void *) + PAGE_SIZE - 1) / PAGE_SIZE;
+   int i;
+
+