Re: [openib-general] [PATCH] IB/uverbs: Fix lockdep warning when QP is created with 2 CQs

2006-08-16 Thread Arjan van de Ven
On Wed, 2006-08-16 at 09:43 -0700, Roland Dreier wrote:
 Arjan, here's a case that disproves your rule of thumb that rwsems
 are equivalent to mutexes as far as correctness goes: you can't have
 an AB-BA deadlock with nested rwsems when using down_read().  In other
 words, the following:
 
   down_read(lock_1);
   down_read(lock_2);
   down_read(lock_2);
   down_read(lock_1);
 
 is perfectly safe

it's safe as long as you never ever do a down_write nested inside or
outside a down_read of any of these locks

-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] [PATCH] Fix potential deadlock in mthca

2006-08-11 Thread Arjan van de Ven
Roland Dreier wrote:
 Here's a long-standing bug that lockdep found very nicely.
 
 Ingo/Arjan, can you confirm that the fix looks OK and I am using
 spin_lock_nested() properly?  I couldn't find much documentation or
 many examples of it, so I'm not positive this is the right way to
 handle this fix.
 

looks correct to me;

Acked-by: Arjan van de Ven [EMAIL PROTECTED]

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] [PATCH] lockdep: don't pull in includes when lockdep disabled

2006-07-26 Thread Arjan van de Ven
On Wed, 2006-07-26 at 09:26 +0300, Michael S. Tsirkin wrote:
 Ingo, does the following look good to you?
 
 Do not pull in various includes through lockdep.h if lockdep is disabled.

Hi,

can you tell us what this fixes? Eg is there a specific problem?
I mean... we're adding ifdefs so there better be a real good reason for
them fixing something real would be such a reason ;-)

Greetings,
Arjan van de Ven


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] [PATCH] Convert idr's internal locking to _irqsave variant

2006-07-14 Thread Arjan van de Ven
On Thu, 2006-07-13 at 17:18 -0700, Roland Dreier wrote:
 Arjan it does get harder if this is needed for your IB device to
 Arjan do more work, so that your swap device on your IB can take
 Arjan more IO's to free up ram..
 
 That's the classic problem, but it's more a matter of the consumer
 using GFP_NOIO in the right places.

GFP_NOIO isn't going to save you in the cases where the memory really is
running low and you need the memory to do more IO...



___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] [PATCH] Convert idr's internal locking to _irqsave variant

2006-07-13 Thread Arjan van de Ven
On Thu, 2006-07-13 at 14:03 -0700, Roland Dreier wrote:
   I suspect it'll get really ugly.  It's a container library which needs to
   allocate memory when items are added, like the radix-tree.  Either it needs
   to assume GFP_ATOMIC, which is bad and can easily fail or it does weird
   things like radix_tree_preload().
 
 Actually I don't think it has to be too bad.  We could tweak the
 interface a little bit so that consumers do something like:
 
   

it does get harder if this is needed for your IB device to do 
more work, so that your swap device on your IB can take more IO's
to free up ram..



___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] [PATCH v3 1/7] AMSO1100 Low Level Driver.

2006-06-23 Thread Arjan van de Ven

   + /* Tell HW to xmit */
   + __raw_writeq(cpu_to_be64(mapaddr), elem-hw_desc + C2_TXP_ADDR);
   + __raw_writew(cpu_to_be16(maplen), elem-hw_desc + C2_TXP_LEN);
   + __raw_writew(cpu_to_be16(TXP_HTXD_READY), elem-hw_desc + C2_TXP_FLAGS);
  
  or here
  
 
 No need here.  This logic submits the packet for transmission.  We don't
 assume it is transmitted until we (after a completion interrupt usually)
 read back the HTXD entry and see the TXP_HTXD_DONE bit set (see
 c2_tx_interrupt()). 

... but will that interrupt happen at all if these 3 writes never hit
the hardware?



___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] [PATCH v3 1/7] AMSO1100 Low Level Driver.

2006-06-23 Thread Arjan van de Ven
On Fri, 2006-06-23 at 08:56 -0500, Steve Wise wrote:
 On Fri, 2006-06-23 at 15:48 +0200, Arjan van de Ven wrote:
 + /* Tell HW to xmit */
 + __raw_writeq(cpu_to_be64(mapaddr), elem-hw_desc + C2_TXP_ADDR);
 + __raw_writew(cpu_to_be16(maplen), elem-hw_desc + C2_TXP_LEN);
 + __raw_writew(cpu_to_be16(TXP_HTXD_READY), elem-hw_desc + 
 C2_TXP_FLAGS);

or here

   
   No need here.  This logic submits the packet for transmission.  We don't
   assume it is transmitted until we (after a completion interrupt usually)
   read back the HTXD entry and see the TXP_HTXD_DONE bit set (see
   c2_tx_interrupt()). 
  
  ... but will that interrupt happen at all if these 3 writes never hit
  the hardware?
  
 
 I thought the posted write WILL eventually get to adapter memory.  Not
 stall forever cached in a bridge.  I'm wrong?

I'm not sure there is a theoretical upper bound 

(and if it's several msec per bridge, then you have a lot of latency
anyway)


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] [PATCH v3 1/7] AMSO1100 Low Level Driver.

2006-06-20 Thread Arjan van de Ven
On Tue, 2006-06-20 at 15:30 -0500, Steve Wise wrote:

 +/*
 + * Allocate TX ring elements and chain them together.
 + * One-to-one association of adapter descriptors with ring elements.
 + */
 +static int c2_tx_ring_alloc(struct c2_ring *tx_ring, void *vaddr,
 + dma_addr_t base, void __iomem * mmio_txp_ring)
 +{
 + struct c2_tx_desc *tx_desc;
 + struct c2_txp_desc __iomem *txp_desc;
 + struct c2_element *elem;
 + int i;
 +
 + tx_ring-start = kmalloc(sizeof(*elem) * tx_ring-count, GFP_KERNEL);

I would think this needs a dma_alloc_coherent() rather than a kmalloc...


 +
 +/* Free all buffers in RX ring, assumes receiver stopped */
 +static void c2_rx_clean(struct c2_port *c2_port)
 +{
 + struct c2_dev *c2dev = c2_port-c2dev;
 + struct c2_ring *rx_ring = c2_port-rx_ring;
 + struct c2_element *elem;
 + struct c2_rx_desc *rx_desc;
 +
 + elem = rx_ring-start;
 + do {
 + rx_desc = elem-ht_desc;
 + rx_desc-len = 0;
 +
 + __raw_writew(0, elem-hw_desc + C2_RXP_STATUS);
 + __raw_writew(0, elem-hw_desc + C2_RXP_COUNT);
 + __raw_writew(0, elem-hw_desc + C2_RXP_LEN);

you seem to be a fan of the __raw_write() functions... any reason why?
__raw_ is not a magic go faster prefix

Also on a related note, have you checked the driver for the needed PCI
posting flushes?

 +
 + /* Disable IRQs by clearing the interrupt mask */
 + writel(1, c2dev-regs + C2_IDIS);
 + writel(0, c2dev-regs + C2_NIMR0);

like here...
 +
 + elem = tx_ring-to_use;
 + elem-skb = skb;
 + elem-mapaddr = mapaddr;
 + elem-maplen = maplen;
 +
 + /* Tell HW to xmit */
 + __raw_writeq(cpu_to_be64(mapaddr), elem-hw_desc + C2_TXP_ADDR);
 + __raw_writew(cpu_to_be16(maplen), elem-hw_desc + C2_TXP_LEN);
 + __raw_writew(cpu_to_be16(TXP_HTXD_READY), elem-hw_desc + C2_TXP_FLAGS);

or here

 +static int c2_change_mtu(struct net_device *netdev, int new_mtu)
 +{
 + int ret = 0;
 +
 + if (new_mtu  ETH_ZLEN || new_mtu  ETH_JUMBO_MTU)
 + return -EINVAL;
 +
 + netdev-mtu = new_mtu;
 +
 + if (netif_running(netdev)) {
 + c2_down(netdev);
 +
 + c2_up(netdev);
 + }

this looks odd...




___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] Re: [PATCH 5 of 13] ipath - use proper address translation routine

2006-05-01 Thread Arjan van de Ven
On Mon, 2006-05-01 at 11:50 -0700, Roland Dreier wrote:
 Bryan Move away from an obsolete, unportable routine for
 Bryan translating physical addresses.
 
 This change:
 
   -  isge-vaddr = bus_to_virt(sge-addr);
   +  isge-vaddr = phys_to_virt(sge-addr);
 
 is really wrong.  bus_to_virt() is really what you want, because in
 this case the address is a bus address that came from dma_map_xxx().
 You're still going to be hosed on systems with IOMMUs for example.

do you really NEED the vaddr? 
(most of the time linux drivers don't need it, while other OSes do)
If you really need it you should grab it at dma_map time ...
(and realize that it's not kernel addressable per se ;)

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] Re: [PATCH 5 of 13] ipath - use proper address translation routine

2006-05-01 Thread Arjan van de Ven
On Mon, 2006-05-01 at 12:00 -0700, Roland Dreier wrote:
 Arjan do you really NEED the vaddr?  (most of the time linux
 Arjan drivers don't need it, while other OSes do) If you really
 Arjan need it you should grab it at dma_map time ...  (and
 Arjan realize that it's not kernel addressable per se ;)
 
 Yes, they need some kind of vaddr.
 
 It's kind of a layering problem.  The IB stack assumes that IB devices
 have a DMA engine that deals with bus addresses.  But the ipath driver
 has to simulate this by using a memcpy on the CPU to move data to the
 PCI device.
 
 I really don't know what the right solution is.  Maybe having some way
 to override the dma mapping operations so that the ipath driver can
 keep the info it needs?

sounds like you need to redesign your layering ;)
In linux it's common to have the lowest level driver do the mapping
(even when the mid layer will provide the most commonly used helper to
do it for the common case)...


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] Re: TSO and IPoIB performance degradation

2006-03-20 Thread Arjan van de Ven
On Mon, 2006-03-20 at 13:27 +0200, Michael S. Tsirkin wrote:
 Quoting David S. Miller [EMAIL PROTECTED]:
  I disagree with Linux changing it's behavior.  It would be great to
  turn off congestion control completely over local gigabit networks,
  but that isn't determinable in any way, so we don't do that.
 
 Interesting. Would it make sense to make it another tunable knob in
 /proc, sysfs or sysctl then?

that's not the right level; since that is per interface. And you only
know the actual interface waay too late (as per earlier posts).
Per socket.. maybe
But then again it's not impossible to have packets for one socket go out
to multiple interfaces
(think load balancing bonding over 2 interfaces, one IB another
ethernet)


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] Re: TSO and IPoIB performance degradation

2006-03-20 Thread Arjan van de Ven
On Mon, 2006-03-20 at 12:49 +0100, Lennert Buytenhek wrote:
 On Mon, Mar 20, 2006 at 12:47:03PM +0100, Arjan van de Ven wrote:
 
I disagree with Linux changing it's behavior.  It would be great to
turn off congestion control completely over local gigabit networks,
but that isn't determinable in any way, so we don't do that.
   
   Interesting. Would it make sense to make it another tunable knob in
   /proc, sysfs or sysctl then?
  
  that's not the right level; since that is per interface. And you only
  know the actual interface waay too late (as per earlier posts).
  Per socket.. maybe
  But then again it's not impossible to have packets for one socket go out
  to multiple interfaces
  (think load balancing bonding over 2 interfaces, one IB another
  ethernet)
 
 I read it as if he was proposing to have a sysctl knob to turn off
 TCP congestion control completely (which has so many issues it's not
 even funny.)

owww that's so bad I didn't even consider that

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


Re: [openib-general] Re: 2.6.16-rc6-mm2: new RDMA CM EXPORT_SYMBOL's

2006-03-20 Thread Arjan van de Ven

 Please explain the thinking behind the choice of a non-GPL export. 
 (Yes, we discussed this when inifiniband was first merged, but it
 doesn't hurt to reiterate).
 
 The agreement made within the OpenIB community, from where this code 
 originates, 
 is that all source code be licensed under a dual license of BSD/GPL.  I am 
 not a 
 lawyer, so I don't know the implications of changing the exports to be GPL 
 only, 
 given the OpenIB license.  But my understanding is that makes using those 
 functions less attractive.

no actually ;)

but I understood OpenIB to be GPL when used in the linux kernel,
optionally BSD when outside linux. Since EXPORT_SYMBOL is highly linux
kernel specific... the _GPL should be just fine

 

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] Re: Revenge of the sysfs maintainer! (was Re: [PATCH 8 of 20] ipath - sysfs support for core driver)

2006-03-10 Thread Arjan van de Ven
On Fri, 2006-03-10 at 05:51 -0800, Bryan O'Sullivan wrote:
 On Fri, 2006-03-10 at 08:57 +0100, Arjan van de Ven wrote:
  On Thu, 2006-03-09 at 20:58 -0800, Bryan O'Sullivan wrote:
   On Thu, 2006-03-09 at 17:00 -0800, Greg KH wrote:
   
They are in the latest -mm tree if you wish to use them.  Unfortunatly
it might look like they will not work out, due to the per-cpu relay
files not working properly with Paul's patches at the moment.
   
   Hmm, OK.
   
What's wrong with debugfs?
   
   It's not configured into the kernels of either of the distros I use (Red
   Hat or SUSE).  I can't have a required part of my driver depend on a
   feature that's not enabled in the major distro kernels.
  
  sucks to be you, however I think it's equally or even more unacceptable
  to cripple the main kernel because you want to also support antique
  kernels (those more than 12 months old).
 
 What antique kernels?  It's not enabled in the latest SLES beta
 (2.6.16-git6 or so), or in Fedora rawhide (also 2.6.16-git).
 
 They mightn't be exactly today's kernels, but they're no more than two
 or three weeks old.  CONFIG_DEBUG_FS has been in the kernel for a long
 time, and it's still not being picked up.

but it's a module; you can ship it no problem yourself if you go through
the hell of shipping external modules



 
   The general rule is if you
  want to support that, do it outside the kernel.org tree.
 
 Which that are you referring to?

supporting really ancient kernels


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] Re: Revenge of the sysfs maintainer! (was Re: [PATCH 8 of 20] ipath - sysfs support for core driver)

2006-03-10 Thread Arjan van de Ven
On Fri, 2006-03-10 at 07:55 -0800, Bryan O'Sullivan wrote:
 On Fri, 2006-03-10 at 15:06 +0100, Arjan van de Ven wrote:
 
   They mightn't be exactly today's kernels, but they're no more than two
   or three weeks old.  CONFIG_DEBUG_FS has been in the kernel for a long
   time, and it's still not being picked up.
  
  but it's a module; you can ship it no problem yourself if you go through
  the hell of shipping external modules
 
 Yes, we can ship it ourselves.  But a significant point of this exercise
 is to have the drivers be available to people who use unmodified
 distros, and don't want to download other bits to do so.

well shipping one .ko or two .ko's... same difference. Or maybe in your
case it's a 4-5 transition. No Difference from a user point of view.


So sorry I'm not buying your argument very much.



___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] Re: Revenge of the sysfs maintainer! (was Re: [PATCH 8 of 20] ipath - sysfs support for core driver)

2006-03-10 Thread Arjan van de Ven
On Fri, 2006-03-10 at 08:36 -0800, Bryan O'Sullivan wrote:
 On Fri, 2006-03-10 at 17:24 +0100, Arjan van de Ven wrote:
 
  well shipping one .ko or two .ko's... same difference. Or maybe in your
  case it's a 4-5 transition. No Difference from a user point of view.
 
 I think you misunderstand.  The goal is for *us* to ship *none* :-)

then it gets easier.. KConfig lets you select debugfs. Done. 


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] Re: Revenge of the sysfs maintainer! (was Re: [PATCH 8 of 20] ipath - sysfs support for core driver)

2006-03-09 Thread Arjan van de Ven
On Thu, 2006-03-09 at 20:58 -0800, Bryan O'Sullivan wrote:
 On Thu, 2006-03-09 at 17:00 -0800, Greg KH wrote:
 
  They are in the latest -mm tree if you wish to use them.  Unfortunatly
  it might look like they will not work out, due to the per-cpu relay
  files not working properly with Paul's patches at the moment.
 
 Hmm, OK.
 
  What's wrong with debugfs?
 
 It's not configured into the kernels of either of the distros I use (Red
 Hat or SUSE).  I can't have a required part of my driver depend on a
 feature that's not enabled in the major distro kernels.

sucks to be you, however I think it's equally or even more unacceptable
to cripple the main kernel because you want to also support antique
kernels (those more than 12 months old). The general rule is if you
want to support that, do it outside the kernel.org tree.


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] Re: [PATCH 02/22] Firmware interface code for IB device.

2006-02-18 Thread Arjan van de Ven
On Sat, 2006-02-18 at 14:26 +0200, Muli Ben-Yehuda wrote:
 On Sat, Feb 18, 2006 at 12:20:11PM +, Christoph Hellwig wrote:
 
   Well, the eHCA guys tell me that they can't post patches to lkml.
  
  Then they lie.  And not posting to lkml is a good reason not to merge
  an otherwise perfect driver.  (which this one is far from)
 
 I don't speak for IBM or the authors, but there are perfectly
 reasonable reasons to ask someone else to post a patch on your behalf
 - including but not limited to to only being able to use Lotus Notes
 with one's IBM email. I'm sure you've all seen the travesties that
 Notes inflicts on inline patches.

there are ways around that with webmail etc.

The bigger issue is: if people can't be bothered to do those steps, why
would they be bothered to do this for maintenance and bugfixes etc etc?
Basically it's now already a de-facto unmaintained driver


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] Re: [PATCH 02/22] Firmware interface code for IB device.

2006-02-18 Thread Arjan van de Ven
On Sat, 2006-02-18 at 08:32 -0800, Roland Dreier wrote:
 Arjan The bigger issue is: if people can't be bothered to do
 Arjan those steps, why would they be bothered to do this for
 Arjan maintenance and bugfixes etc etc?  Basically it's now
 Arjan already a de-facto unmaintained driver
 
 I don't think that's really a fair statement.

It's a concern at least; if they're just having trouble posting really
big files that's one thing.. if they're not allowed to post at all
that's another.

 IBM people: can you clarify the restrictions you have?  Why do you
 feel you can't post your own driver for review?  Will you be able to
 post smaller patches to lkml in the future if the driver is merged?

And can you respond to questions and user questions on lkml?


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


Re: [openib-general] Re: [PATCH 2/5] [RFC] Infiniband: connection abstraction

2006-01-19 Thread Arjan van de Ven
On Thu, 2006-01-19 at 11:49 -0800, Bryan O'Sullivan wrote:
 On Thu, 2006-01-19 at 11:34 -0800, Sean Hefty wrote:
 
  Here was the suggestion that I received as a change to the license:
  
  licenses.  You may choose to be licensed under the terms of the GNU General 
  Public License (GPL) Version 2, available from the file COPYING in the main 
  directory of this source tree, or, when using this code outside the context 
  of 
  the linux kernel or other GPL works, the OpenIB.org BSD license below
  
  The differences between this version and the one used by OpenIB is slight, 
  but 
  it does clarify that the version in Linux is GPL, which is the case.
 
 That seems reasonable to me.  Arjan?

To me it's a very good clarification of the situation and avoids a
situation where people get the wrong idea. So I'm all for it.


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] Re: [PATCH 2/5] [RFC] Infiniband: connection abstraction

2006-01-17 Thread Arjan van de Ven

 
 flamebait
 Also should infiniband exports be EXPORT_SYMBOL_GPL, to make
 it clear that binary drivers for this are not allowed??
 /flamebait

the dual license text needs a bit of clarification I suspect to make
explicit that the or BSD part only applies when used entirely outside
the linux kernel. (that already is the case, just it's not explicit.
Making that explicit would be good).



___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] Re: [PATCH 0 of 20] [RFC] ipath - PathScale InfiniPath driver

2006-01-03 Thread Arjan van de Ven
On Tue, 2006-01-03 at 12:54 -0800, Bryan O'Sullivan wrote:
 On Tue, 2006-01-03 at 09:27 -0800, Greg KH wrote:
 
  Idealy, nothing should be new ioctls.  But in the end, it all depends on
  exactly what you are trying to do with each different one.
 
 Fair enough.
 
  I really don't know what the subnet management stuff involves, sorry.
  But doesn't the open-ib layer handle that all for you already?
 
 It does when our OpenIB driver is being used.  But our lower level
 driver is independent of OpenIB (and is often used without the
 infiniband stuff even configured into the kernel), and needs to provide
 some way for a userspace subnet management agent to send and receive
 packets.

that sounds like your driver should mimic the openIB userspace ABI for
this *exactly* so that you can use the same management tools for either
scenario...


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] Re: [PATCH 10 of 20] ipath - core driver, part 3 of 4

2005-12-31 Thread Arjan van de Ven
On Fri, 2005-12-30 at 15:50 -0800, Bryan O'Sullivan wrote:
 On Fri, 2005-12-30 at 10:46 -0800, Linus Torvalds wrote:
 
  All your user page lookup/pinning code is terminally broken.
 
 Yes, this has been pointed out by a few others.
 
  Crap like this must not be merged.
 
 I'm already busy decrappifying it...

the point I think also was the fact that it exists is already wrong :)

makes it easier for you.. rm is a very powerful decrappify tool, as is
block delete in just about any editor ;)


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] Re: [PATCH 12 of 20] ipath - misc driver support code

2005-12-30 Thread Arjan van de Ven
 +int ipath_get_upages_nocopy(unsigned long start_page, struct page **p)
 +{
 + int n;
 + struct vm_area_struct *vm = NULL;
 +
 + down_read(current-mm-mmap_sem);
 + n = get_user_pages(current, current-mm, start_page, 1, 1, 1, p, vm);
 + up_read(current-mm-mmap_sem);
 + if (n != 1) {
 + _IPATH_INFO(get_user_pages for 0x%lx failed with %d\n,
 + start_page, n);
 + if (n  0)  /* it's an errno */
 + return n;
 + /*
 +  * If we ever ask for more than a single page, we will have to
 +  * free the pages (if any) that we did get, via 
 ipath_get_upages()
 +  * or put_page() directly.
 +  */
 + return -ENOMEM; /* no way to know actual error */
 + }
 + vm-vm_flags |= VM_SHM | VM_LOCKED;
 +
 + return 0;
 +}


I hope you're not depending on the VM_LOCKED thing.. since the user can
just undo that easily!

(this is also why all this sys_mlock from the driver is traditionally
buggy to the point of being a roothole, things like some of the binary
3D drivers have had this security hole for a long time, as did some of
the early infiniband drivers)

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] Re: [PATCH 01/13] [RFC] ipath basic headers

2005-12-17 Thread Arjan van de Ven
On Sat, 2005-12-17 at 14:19 -0800, Robert Walsh wrote:
   +{
   + void *ssv, *dsv;
   + uint32_t csv;
   + __asm__ __volatile__(cld\n\trep\n\tmovsb:=c(csv), =D(dsv),
   +  =S(ssv)
   +  :0(cnt), 1(dest), 2(src)
   +  :memory);
   +}
  
  No way we're gonna put assembler code into such a driver.
 
 Why not?  The chip (and therefore the driver) only works with Opterons.
 It's tied to the HT bus, but PCI or anything like that.

and opterons can already run 2 architectures. And the HT bus is a
generic bus.. with public specs. Others than just AMD use it as well.

also.. what is wrong with memcpy and co ?

   +static __inline__ uint32_t ipath_kget_kreg32(const ipath_type stype,
   +  ipath_kreg regno)
   +{
   + volatile uint32_t *kreg32;
   +
   + if (!devdata[stype].ipath_kregbase)
   + return ~0;
   +
   + kreg32 = (volatile uint32_t *)devdata[stype].ipath_kregbase[regno];
  
  volatile use is probably always wrong. but this whole functions looks like
  a very odd wrapper anyway?
 
 The volatile is there so the compiler doesn't optimize away the read.
 This is important, because reads of our hardware have side-effects and
 cannot be optimized out.

then you need to use readl() and family most like; they already take
care of this anyway.


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


Re: [openib-general] Re: [RFC] Move InfiniBand .h files

2005-08-04 Thread Arjan van de Ven
On Thu, 2005-08-04 at 11:26 -0700, Grant Grundler wrote:
 On Thu, Aug 04, 2005 at 07:53:58PM +0200, Arjan van de Ven wrote:
  On Thu, 2005-08-04 at 10:32 -0700, Roland Dreier wrote:
   I would like to get people's reactions to moving the InfiniBand .h
   files from their current location in drivers/infiniband/include/ to
   include/linux/rdma/.  If we agree that this is a good idea then I'll
   push this change as soon as 2.6.14 starts.
  
  please only put userspace clean headers here; the rest is more or less
  private headers for your subsystem. 
 
 Sorry...this smells like a rathole...but does this mean
 linus agrees the kernel subsystems should export headers suitable for
 both user space and kernel driver modules?
 
 Historical, I thought glibc and other user space libs were expected to
 maintain their own set of header files. Maybe I'm just confused...

there is a definite requirement for the kernel to expose SOME things to
userspace. Well for SOMETHING to expose them. Right now most distros
ship a hacked up version of the kernel headers (eg removed of all the
kernel specific stuff and all the gpl inline code etc). A good part of
making such an external project possible is to make a clean separation
between userspace shared stuff and pure kernel internals.


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] Re: [RFC] Move InfiniBand .h files

2005-08-04 Thread Arjan van de Ven
On Thu, 2005-08-04 at 11:57 -0700, Roland Dreier wrote:
 Roland Also, drivers/infiniband/include doesn't get put into the
 Roland /lib/modules/ver/build directory,
 
 Arjan that is a symlink not a directory, and a symlink to the
 Arjan full source...
 
 Sorry, I was too terse about the problem.  You're right, but typical
 distros don't ship full kernel source in their support kernel builds
 package.

so what makes you think they will ship include/infiniband ?


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] Re: [PATCH][RFC][0/4] InfiniBand userspace verbs implementation

2005-04-22 Thread Arjan van de Ven
On Fri, 2005-04-22 at 12:55 -0500, Timur Tabi wrote:
 Arjan van de Ven wrote:
  On Mon, 2005-04-18 at 11:09 -0500, Timur Tabi wrote:
  
 Roland Dreier wrote:
 
 Troy How is memory pinning handled? (I haven't had time to read
 Troy all the code, so please excuse my ignorance of something
 Troy obvious).
 
 The userspace library calls mlock() and then the kernel does
 get_user_pages().
 
 Why do you call mlock() and get_user_pages()?  In our code, we only call 
 mlock(), and the 
 memory is pinned. 
  
  
  this is a myth; linux is free to move the page about in physical memory
  even if it's mlock()ed!!
 
 Can you tell me when Linux actually does this?  I know in theory it can 
 happen, but I've 
 never seen it.  Does the code to implement moving of data from one physical 
 page to 
 another even exist in any version of Linux?

hot(un)plug memory.

 
 Also, what would be the point?  What reason would there be to move some data 
 from one 
 physical page to another, while keeping the same virtual address?

so that you can hot unplug the dimm in question.

I guess that's a bit of a high end though though... so maybe you don't
care about it.


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


Re: [openib-general] Re: [PATCH][RFC][0/4] InfiniBand userspace verbs implementation

2005-04-21 Thread Arjan van de Ven
On Thu, 2005-04-21 at 13:25 -0700, Chris Wright wrote:
 * Timur Tabi ([EMAIL PROTECTED]) wrote:
  It works with every kernel I've tried.  I'm sure there are plenty of kernel 
  configuration options that will break our driver.  But as long as all the 
  distros our customers use work, as well as reasonably-configured custom 
  kernels, we're happy.
  
 
 Hey, if you're happy (and, as you said, you don't intend to merge that
 bit), I'm happy ;-)


yeah... drivers giving unprivileged processes more privs belong on
bugtraq though, not in the core kernel :)


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] Re: [PATCH][RFC][0/4] InfiniBand userspace verbs implementation

2005-04-18 Thread Arjan van de Ven
On Mon, 2005-04-18 at 11:09 -0500, Timur Tabi wrote:
 Roland Dreier wrote:
  Troy How is memory pinning handled? (I haven't had time to read
  Troy all the code, so please excuse my ignorance of something
  Troy obvious).
  
  The userspace library calls mlock() and then the kernel does
  get_user_pages().
 
 Why do you call mlock() and get_user_pages()?  In our code, we only call 
 mlock(), and the 
 memory is pinned. 

this is a myth; linux is free to move the page about in physical memory
even if it's mlock()ed!!

And even then, the user can munlock the memory from another thread etc
etc. Not a good idea.

get_user_pages() is used from AIO and other parts of the kernel for
similar purposes and in fact is designed for it, so it better work. If
it has bugs those should be fixed, not worked around!


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] Re: [PATCH][RFC][0/4] InfiniBand userspace verbs implementation

2005-04-18 Thread Arjan van de Ven
On Mon, 2005-04-18 at 11:25 -0500, Timur Tabi wrote:
 Arjan van de Ven wrote:
 
  this is a myth; linux is free to move the page about in physical memory
  even if it's mlock()ed!!
 
 Then Linux has a very odd definition of the word locked.
 
  And even then, the user can munlock the memory from another thread etc
  etc. Not a good idea.
 
 Well, that's okay, because then the app is doing something stupid, so we 
 don't worry about 
 that.

you should since that physical page can be reused, say by a root
process, and you'd be majorly screwed


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


[openib-general] Re: [PATCH][RFC][0/4] InfiniBand userspace verbs implementation

2005-04-18 Thread Arjan van de Ven
On Mon, 2005-04-18 at 15:00 -0500, Timur Tabi wrote:
 Arjan van de Ven wrote:
 
  you should since that physical page can be reused, say by a root
  process, and you'd be majorly screwed
 
 I don't understand what you mean by reused.  The whole point behind pinning 
 the memory 
 is that it stays where it is.  It doesn't get moved around and it doesn't get 
 swapped out.
 
you just said that you didn't care that it got munlock'd. So you don't
care that it gets freed either. And then reused.


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general