Re: [openib-general] more comments on cxgb3

2007-02-08 Thread Steve Wise
On Thu, 2007-02-08 at 08:40 +0200, Michael S. Tsirkin wrote:
 OK, so I looked at cxgb3 some more.

Thanks!

 To summarise my previous comments, I think the cxio hal layer needs to go to
 make the code readable - if I understand correctly it is there for historical
 reasons only.
 

I can do this but its low on the list of todos.


 I started looking at userspace/kernel interaction, and then
 went over to other code under cxgb3 (but not core/).
 
 - Consider a user that does e.g. create QP, but never calls mmap.
   Is there some code that will clean out the unclamed mmap object?
   I couldn't find it, and iwch_dealloc_ucontext does not seem to
   do anything with it.
 

This is a bug.  I've got a fix for it too.  

 - Passing physical address to userspace and back looks suspicios.
   Especially this:
 uresp.physaddr = virt_to_phys(chp-cq.queue);
   Could you elaborate on the design here? What are these phy addresses
   and how come userspace needs to know the phy address?
   You are not doing DMA by this address, by any chance?
 

No, Its not used for DMA by the HW.  The physaddr is passed up to the
user, and the user then mmaps() using that as the offset.

I took this code from the ipath driver.  It has been pointed out to me
that this is broken for 32b userspace on a 64 kernel because mmap2()
cannot pass down 64 bits.  So I need to rework this code.

 - It seems that by passing in huge resource sizes, userspace will be able to
   drink up unlimited amounts of kernel memory.
   mthca handles this by using the mlock rlimit, should something be done here
   as well?

Hmm.  That's a good point.  I'll put this on the todo as well.  So mthca
adds to process's rlimit value as things are allocated out of kernel
memory (or maybe even anything that gets pinned)?

  
 A couple of comments on PDBG macro.
 - I'd like to suggest following the practice of prefixing macro names with 
 module name
   (same goes for functions like get_mhp really) - unless they are local to 
 file.
 
 - You are using __FUNCTION__ a lot - it might be to just to kill it,
   messages are unique so you'll be able to locate the msg source anyway,
   save some kernel text and logs will be shorter. In any case I think
   __func__ is the recommended gcc way to get the name currently.
 
 - comment near pr_debug definition in include/linux/kernel.h says:
   /* If you are writing a driver, please use dev_dbg instead */
   so it might be a good idea for PDBG to follow this rule.
 
 - log messages do not look very informative to me.
   I also think they are a bit too many of them currently.
   For example, I do not think it is a good idea to print
   the kernel pointers out.
 
   For example, in code like the following:
   mhp = get_mhp(rhp, (sg_list[i].lkey)  8);
   if (!mhp) {
   PDBG(%s %d\n, __FUNCTION__, __LINE__);
   return -EIO;
   }
 
   might be better to say 
   MR key XXX does not exist. Exiting..
   -EIO also looks like a strange error code to return here, does it not?
   Maybe something like EINVAL would be more appropriate?
 

I'll take a todo to clean up the debug stuff. 

 - I wonder about the names like get_mhp - what does mhp mean?
 static inline struct iwch_mr *get_mhp(struct iwch_dev *rhp, u32 mmid)
 {
 return idr_find(rhp-mmidr, mmid);
 }
 

Memory Handle Pointer.


 Looks like it looks up an mr. Is that right? Maybe the name shouldbe changed
 to convey this meaning.
 
 - In the following code, what does missing pdid check mean?
 /*
  * TBD: this is going to be moved to firmware. Missing pdid/qpid check for 
 now.
  */
 This sounds interesting.
 Does this mean the code does not validate the PD currently?
 

I need firmware support for this.  It will be done asap and I can remove
this code entirely.

 I have the same question for:
 /* TBD: check perms */
 in iwch_bind_mw.
 
 BTW, does TBD stand for To Be Done here?

Yes.

 Do you mean TODO really?
 
 - iwch_sgl2pbl_map is used in several places, and seems a bit too big to be 
 inline
 
 Well, it's time to go do my day job now :)
 
 Hope this helps,
 

Thanks again Michael!

Steve.



___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] more comments on cxgb3

2007-02-08 Thread Michael S. Tsirkin
  - It seems that by passing in huge resource sizes, userspace will be able to
drink up unlimited amounts of kernel memory.
mthca handles this by using the mlock rlimit, should something be done 
  here
as well?
 
 Hmm.  That's a good point.  I'll put this on the todo as well.  So mthca
 adds to process's rlimit value as things are allocated out of kernel
 memory (or maybe even anything that gets pinned)?

Yes. The code is actually in uverbs core, mthca uses that.

  - I wonder about the names like get_mhp - what does mhp mean?
  static inline struct iwch_mr *get_mhp(struct iwch_dev *rhp, u32 mmid)
  {
  return idr_find(rhp-mmidr, mmid);
  }
  
 
 Memory Handle Pointer.

hmm, what's a Handle? Maybe a better name can be found.

-- 
MST

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] more comments on cxgb3

2007-02-08 Thread Steve Wise
 - Consider a user that does e.g. create QP, but never calls mmap.
   Is there some code that will clean out the unclamed mmap object?
   I couldn't find it, and iwch_dealloc_ucontext does not seem to
   do anything with it.

BTW: Here is my fix for this.

-

Clean up pending mmaps on ucontext deallocation.

From: Steve Wise [EMAIL PROTECTED]

Free all pending mmap structs when the ucontext is deallocated.

Signed-off-by: Steve Wise [EMAIL PROTECTED]
---

 drivers/infiniband/hw/cxgb3/iwch_provider.c |1 +
 drivers/infiniband/hw/cxgb3/iwch_provider.h |   15 +++
 2 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c 
b/drivers/infiniband/hw/cxgb3/iwch_provider.c
index db2b0a8..98568ee 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
@@ -99,6 +99,7 @@ static int iwch_dealloc_ucontext(struct 
struct iwch_dev *rhp = to_iwch_dev(context-device);
struct iwch_ucontext *ucontext = to_iwch_ucontext(context);
PDBG(%s context %p\n, __FUNCTION__, context);
+   free_mmaps(ucontext);
cxio_release_ucontext(rhp-rdev, ucontext-uctx);
kfree(ucontext);
return 0;
diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.h 
b/drivers/infiniband/hw/cxgb3/iwch_provider.h
index 1ede8a7..c8c07ee 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_provider.h
+++ b/drivers/infiniband/hw/cxgb3/iwch_provider.h
@@ -199,6 +199,21 @@ struct iwch_mm_entry {
unsigned len;
 };
 
+static inline void free_mmaps(struct iwch_ucontext *ucontext)
+{
+   struct list_head *pos, *nxt;
+   struct iwch_mm_entry *mm;
+
+   spin_lock(ucontext-mmap_lock);
+   list_for_each_safe(pos, nxt, ucontext-mmaps) {
+   mm = list_entry(pos, struct iwch_mm_entry, entry);
+   list_del(mm-entry);
+   kfree(mm);
+   }
+   spin_unlock(ucontext-mmap_lock);
+   return;
+}
+
 static inline struct iwch_mm_entry *remove_mmap(struct iwch_ucontext *ucontext,
u64 addr, unsigned len)
 {



___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] more comments on cxgb3

2007-02-08 Thread Roland Dreier
  diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c 
  b/drivers/infiniband/hw/cxgb3/iwch_provider.c
  index db2b0a8..98568ee 100644
  --- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
  +++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
  @@ -99,6 +99,7 @@ static int iwch_dealloc_ucontext(struct 
   struct iwch_dev *rhp = to_iwch_dev(context-device);
   struct iwch_ucontext *ucontext = to_iwch_ucontext(context);
   PDBG(%s context %p\n, __FUNCTION__, context);
  +free_mmaps(ucontext);
   cxio_release_ucontext(rhp-rdev, ucontext-uctx);
   kfree(ucontext);
   return 0;
  diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.h 
  b/drivers/infiniband/hw/cxgb3/iwch_provider.h
  index 1ede8a7..c8c07ee 100644
  --- a/drivers/infiniband/hw/cxgb3/iwch_provider.h
  +++ b/drivers/infiniband/hw/cxgb3/iwch_provider.h
  @@ -199,6 +199,21 @@ struct iwch_mm_entry {
   unsigned len;
   };
   
  +static inline void free_mmaps(struct iwch_ucontext *ucontext)
  +{
  +struct list_head *pos, *nxt;
  +struct iwch_mm_entry *mm;
  +
  +spin_lock(ucontext-mmap_lock);
  +list_for_each_safe(pos, nxt, ucontext-mmaps) {
  +mm = list_entry(pos, struct iwch_mm_entry, entry);
  +list_del(mm-entry);
  +kfree(mm);
  +}
  +spin_unlock(ucontext-mmap_lock);
  +return;
  +}

Since you only have one caller, I would suggest just open-coding the
deletion at the call-site (since that function is really too big to
inline if it ever grows another caller).  And I don't think you need
the locking either, since there better be no one else looking at the
context structure while you're in the process of freeing it.

Something like:

struct iwch_dev *rhp = to_iwch_dev(context-device);
struct iwch_ucontext *ucontext = to_iwch_ucontext(context);
struct iwch_mm_entry *mm, *tmp;

PDBG(%s context %p\n, __FUNCTION__, context);
list_for_each_entry_safe(mm, tmp, ucontext-mmaps)
kfree(mm);
cxio_release_ucontext(rhp-rdev, ucontext-uctx);
kfree(ucontext);
return 0;

 - R.

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] more comments on cxgb3

2007-02-07 Thread Michael S. Tsirkin
OK, so I looked at cxgb3 some more.
To summarise my previous comments, I think the cxio hal layer needs to go to
make the code readable - if I understand correctly it is there for historical
reasons only.

I started looking at userspace/kernel interaction, and then
went over to other code under cxgb3 (but not core/).

- Consider a user that does e.g. create QP, but never calls mmap.
  Is there some code that will clean out the unclamed mmap object?
  I couldn't find it, and iwch_dealloc_ucontext does not seem to
  do anything with it.

- Passing physical address to userspace and back looks suspicios.
  Especially this:
uresp.physaddr = virt_to_phys(chp-cq.queue);
  Could you elaborate on the design here? What are these phy addresses
  and how come userspace needs to know the phy address?
  You are not doing DMA by this address, by any chance?

- It seems that by passing in huge resource sizes, userspace will be able to
  drink up unlimited amounts of kernel memory.
  mthca handles this by using the mlock rlimit, should something be done here
  as well?
 
A couple of comments on PDBG macro.
- I'd like to suggest following the practice of prefixing macro names with 
module name
  (same goes for functions like get_mhp really) - unless they are local to file.

- You are using __FUNCTION__ a lot - it might be to just to kill it,
  messages are unique so you'll be able to locate the msg source anyway,
  save some kernel text and logs will be shorter. In any case I think
  __func__ is the recommended gcc way to get the name currently.

- comment near pr_debug definition in include/linux/kernel.h says:
/* If you are writing a driver, please use dev_dbg instead */
  so it might be a good idea for PDBG to follow this rule.

- log messages do not look very informative to me.
  I also think they are a bit too many of them currently.
  For example, I do not think it is a good idea to print
  the kernel pointers out.

  For example, in code like the following:
mhp = get_mhp(rhp, (sg_list[i].lkey)  8);
if (!mhp) {
PDBG(%s %d\n, __FUNCTION__, __LINE__);
return -EIO;
}

  might be better to say 
  MR key XXX does not exist. Exiting..
  -EIO also looks like a strange error code to return here, does it not?
  Maybe something like EINVAL would be more appropriate?

- I wonder about the names like get_mhp - what does mhp mean?
static inline struct iwch_mr *get_mhp(struct iwch_dev *rhp, u32 mmid)
{
return idr_find(rhp-mmidr, mmid);
}

Looks like it looks up an mr. Is that right? Maybe the name shouldbe changed
to convey this meaning.

- In the following code, what does missing pdid check mean?
/*
 * TBD: this is going to be moved to firmware. Missing pdid/qpid check for now.
 */
This sounds interesting.
Does this mean the code does not validate the PD currently?

I have the same question for:
/* TBD: check perms */
in iwch_bind_mw.

BTW, does TBD stand for To Be Done here?
google says:
Definitions of TBD on the Web:

* To Be Determined, Defined, Decided.
  www.csr.com/ptot.htm

* to be determined
  
www.liberalsagainstterrorism.com/wiki/index.php/Counterinsurgency_Operations/Glossary

* Treasury Board (Secretariat)
  www.psc-cfp.gc.ca/centres/definitions_and_notes_e.htm

* The three letter abbreviation TBD may be/mean, depending on context: * an 
 acronym for To Be Determined (...at a later point in time., typically)* 
the Douglas Devastator, a US Navy torpedo bomber of World War II
  en.wikipedia.org/wiki/TBD

What is to be determined here?
Do you mean TODO really?

- iwch_sgl2pbl_map is used in several places, and seems a bit too big to be 
inline

Well, it's time to go do my day job now :)

Hope this helps,

-- 
MST

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general