RE: [openib-general] How do we prevent starvation, say between TCP over IPoIB and SRP traffic?

2006-04-19 Thread Diego Crupnicoff

> 
> To manage QoS the question is who knows about all the traffic
> traversing a specific adapter. For most kernel traversing protocols
> (IP, iSER, iSCSI, etc) you can sometimes do this in the device
> driver, where you can examine the headers as a packet is expedited
> and manage it there. Unfortunately you are adding processing in the
> driver which can end up impacting bandwidth on high speed adapters.
> You also introduce additional overhead, hence higher CPU utilization.

Right. You do not want to do this in SW. Most IB HCAs can do this for
you at wire speed, with absolutely no toll on host CPU utilization.
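
Concretely, the hardware classification works by mapping flows onto IB
service levels; the fabric's SL-to-VL mapping and VL arbitration then
enforce the policy at wire speed. A minimal libibverbs sketch (the SL
value and the QP-transition context are illustrative assumptions, not
from this thread):

#include <infiniband/verbs.h>
#include <stdint.h>

/* Sketch: steer a connection's traffic onto a given IB service level
 * so that QoS is enforced by the HCA and switches (SL-to-VL mapping
 * plus VL arbitration) rather than by header inspection in the
 * driver. */
static int set_service_level(struct ibv_qp *qp,
                             struct ibv_qp_attr *rtr_attr,
                             int rtr_mask, uint8_t sl)
{
    /* The SL lives in the address vector programmed at the INIT->RTR
     * transition; every packet the QP sends then carries it, and the
     * fabric schedules it onto a VL at zero host CPU cost. */
    rtr_attr->ah_attr.sl = sl;
    return ibv_modify_qp(qp, rtr_attr, rtr_mask | IBV_QP_AV);
}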



RE: [openib-general] mthca FMR correctness (and memory windows)

2006-03-20 Thread Diego Crupnicoff

> -----Original Message-----
> From: Talpey, Thomas [mailto:[EMAIL PROTECTED]]
> Sent: Monday, March 20, 2006 11:19 PM
> To: Diego Crupnicoff; Roland Dreier
> Cc: openib-general@openib.org
> Subject: RE: [openib-general] mthca FMR correctness (and memory windows)
> 
> At 08:42 PM 3/20/2006, Diego Crupnicoff wrote:
> >> If I can snoop or guess rkeys (not a huge challenge with 32 bits),
> >> and if I can use them on an arbitrary queue pair, then I can
> >> handily peek and poke at memory that does not belong to me.
> >
> >No. You can't get to the Window from an arbitrary QP. Only from
> >those QPs that belong to the same PD.
> 
> 
> Oh yeah, I have to guess the PD too.


I do not think you can effectively guess a PD. The PD is a privileged field that gets set at QP creation time (as opposed to rkeys, which are input modifiers supplied when posting a WQE). The side exposing the memory has to "agree" to open a QP on a PD in order to make windows bound to that PD accessible through it.
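
For readers less familiar with the verbs interface, here is a sketch of
the PD scoping described above, in libibverbs terms (buffer sizes and
queue depths are illustrative; this is not mthca-specific code):

#include <infiniband/verbs.h>

/* Sketch: an rkey is only honored when the request arrives on a QP
 * whose protection domain matches the MR (or window). Guessing the
 * 32-bit rkey is not enough; the PD is fixed at QP creation time. */
void pd_scoping_demo(struct ibv_context *ctx)
{
    struct ibv_pd *pd_a = ibv_alloc_pd(ctx);
    struct ibv_pd *pd_b = ibv_alloc_pd(ctx);
    struct ibv_cq *cq   = ibv_create_cq(ctx, 16, NULL, NULL, 0);

    /* The PD is a privileged, create-time property of the QP... */
    struct ibv_qp_init_attr qpia = {
        .send_cq = cq, .recv_cq = cq,
        .cap = { .max_send_wr = 16, .max_recv_wr = 16,
                 .max_send_sge = 1, .max_recv_sge = 1 },
        .qp_type = IBV_QPT_RC,
    };
    struct ibv_qp *qp_b = ibv_create_qp(pd_b, &qpia);

    /* ...while the rkey comes from memory registered on some PD. */
    static char buf[4096];
    struct ibv_mr *mr_a = ibv_reg_mr(pd_a, buf, sizeof buf,
                                     IBV_ACCESS_LOCAL_WRITE |
                                     IBV_ACCESS_REMOTE_WRITE);

    /* A remote RDMA WRITE quoting mr_a->rkey is honored only if it
     * arrives on a QP created on pd_a; arriving on qp_b (pd_b) it
     * fails with a protection error even if the key was guessed. */
    (void)qp_b; (void)mr_a;
}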

> 
> >> For this reason, iWARP requires its steering tags to be scoped to
> >> a single connection. This leverages the IP security model and
> >> provides correctness.
> >>
> >> It is true that IB implementations generally don't do this. They
> >> should.
> >
> >IB allows the 2 flavors (PD bound Windows aka type 1, and QP bound
> >Windows aka type 2).
> 
> Does mthca? I thought it's all type 1.


As noted before, currently mthca does not support memory windows at all.


> 
> Tom.




RE: [openib-general] mthca FMR correctness (and memory windows)

2006-03-20 Thread Diego Crupnicoff

> >
> >Can you elaborate on the issue of fungibility?  If one entity has two
> >QPs, one of which it's using for traffic and one of which it's using
> >for MW binds, I don't see any security issue (beyond the fact that
> >you've now given up ordering of operations between the QPs).
> 
> If I can snoop or guess rkeys (not a huge challenge with 32 bits), and
> if I can use them on an arbitrary queue pair, then I can handily peek and
> poke at memory that does not belong to me.


No. You can't get to the Window from an arbitrary QP. Only from those QPs that belong to the same PD.


> 
> For this reason, iWARP requires its steering tags to be scoped to a single
> connection. This leverages the IP security model and provides correctness.
> 
> It is true that IB implementations generally don't do this. They should.


IB allows the 2 flavors (PD bound Windows aka type 1, and QP bound Windows aka type 2).


> 
> Tom.




RE: [openib-general] [RFC] [PATCH] SRQ API

2005-06-16 Thread Diego Crupnicoff

> -----Original Message-----
> From: Caitlin Bestler [mailto:[EMAIL PROTECTED]] 
> Sent: Wednesday, June 15, 2005 6:49 PM
> To: Roland Dreier
> Cc: openib-general@openib.org
> Subject: Re: [openib-general] [RFC] [PATCH] SRQ API
> 
> 


... 


> per RQ soft high-watermark (event when one QP allocates more 
> than N uncompleted buffers) and a 
> per RQ hard high-watermark (connection terminated when more 
> than N buffers required for one QP). The first two are 
> defined for iWARP/RDMAC, the third is optional under RNIC-PI.
> 


In IB, at any given time, a QP will never have more than one receive WQE allocated. So these last two events would never trigger on an IB implementation.

Diego
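
The iWARP-style event that does map naturally onto IB SRQs is the soft
low-watermark: verbs exposes it as the SRQ limit. A minimal libibverbs
sketch (queue depths and the limit value are illustrative):

#include <infiniband/verbs.h>

/* Sketch: create an SRQ and arm its low-watermark event. When the
 * number of posted receive WQEs drops below srq_limit, the HCA raises
 * IBV_EVENT_SRQ_LIMIT_REACHED on the async event channel. */
struct ibv_srq *make_srq_with_watermark(struct ibv_pd *pd)
{
    struct ibv_srq_init_attr init = {
        .attr = { .max_wr = 1024, .max_sge = 1, .srq_limit = 0 },
    };
    struct ibv_srq *srq = ibv_create_srq(pd, &init);
    if (!srq)
        return NULL;

    /* Arm the limit: fire the async event once the SRQ drops below
     * 64 posted receive WQEs, so the app can repost in time. */
    struct ibv_srq_attr attr = { .srq_limit = 64 };
    if (ibv_modify_srq(srq, &attr, IBV_SRQ_LIMIT)) {
        ibv_destroy_srq(srq);
        return NULL;
    }
    return srq;
}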







RE: [openib-general] performance counters in /sys

2005-05-19 Thread Diego Crupnicoff

> -----Original Message-----
> From: Hal Rosenstock [mailto:[EMAIL PROTECTED]] 
> Sent: Thursday, May 19, 2005 6:48 PM
> To: Yaron Haviv
> Cc: Mark Seger; openib-general@openib.org
> Subject: RE: [openib-general] performance counters in /sys
> 
> 
> On Thu, 2005-05-19 at 16:45, Yaron Haviv wrote:
> > I believe you can use the per VL counters for that
> > (IB allows counting traffic on a specific VL)
> > By matching ULPs to VLs (e.g. through the ib_at lib we suggested)
> > You can get both congestion isolation per traffic type as 
> well as the
> > ability to count traffic per ULP 
> > (note that up to 8 VLs are supported in the Mellanox chips)
> 
> PortXmitDataVL[n], PortRcvDataVL[n], PortXmitPktVL[n], and 
> PortRcvPktVL[n] are all IB optional. Do the Mellanox HCAs 
> support these counters ?
> 
> -- Hal


No.
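
The mandatory PortCounters attributes are what end up exported through
/sys, which is what this thread is about. A minimal sketch of reading
one such counter; the mthca0 device name, port number, and sysfs layout
are assumptions of this example:

#include <stdio.h>

/* Sketch: read a mandatory per-port counter from sysfs, e.g.
 * read_counter("port_xmit_data"). Note that PortXmitData counts in
 * units of 4 bytes, per the IB spec definition. */
static long read_counter(const char *name)
{
    char path[256];
    long v = -1;

    snprintf(path, sizeof path,
             "/sys/class/infiniband/mthca0/ports/1/counters/%s", name);
    FILE *f = fopen(path, "r");
    if (f) {
        if (fscanf(f, "%ld", &v) != 1)
            v = -1;
        fclose(f);
    }
    return v;
}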




RE: [openib-general] Re: [PATCH][RFC][0/4] InfiniBand userspace verbs implementation

2005-04-28 Thread Diego Crupnicoff

> The userspace library should be able to track the tree and
> the overlaps, etc.  Things might become interesting when the 
> memory is MAP_SHARED pagecache and multiple independent 
> processes are involved, although I guess that'd work OK.
> 
> But afaict the problem wherein part of a page needs
> VM_DONTCOPY and the other part does not cannot be solved.


Not sure it was such a good idea, but at some point we thought of forcing the copy of (***only***) such pages at fork time (leaving, of course, the original one for the parent). This eliminates the COW that would have messed up the parent's mapping, and still allows the child process to access the "un-registered" portions of the page.

BTW: we did try to "motivate" applications to do whole-page registrations only, so as to avoid this issue altogether. But that did not work. Some (hard to ignore) applications want byte granularity.
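
One way the fork problem was later sidestepped in Linux is to exclude
registered pages from the child altogether rather than copying them
eagerly. A sketch using madvise(MADV_DONTFORK), shown as an
illustration of the trade-off rather than as the approach described
above (Linux-specific; page size hardcoded for brevity):

#include <sys/mman.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch: mark a registered region so the child does not inherit it.
 * madvise() works on whole pages, so the region must be rounded out,
 * which is exactly why sub-page registrations stay awkward: the
 * rounding can swallow bytes the child still expects to use. */
static int dont_fork_region(void *addr, size_t len)
{
    const uintptr_t pg = 4096;              /* assumed page size */
    uintptr_t start = (uintptr_t)addr & ~(pg - 1);
    uintptr_t end   = ((uintptr_t)addr + len + pg - 1) & ~(pg - 1);

    return madvise((void *)start, end - start, MADV_DONTFORK);
}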









RE: [openib-general] SRQ field in CM messages

2005-01-06 Thread Diego Crupnicoff

This bit was added to the CM protocol so that the remote side QP can distinguish between an SRQ and a TCA that does not generate e2e credits.

Thanks,


Diego


> -----Original Message-----
> From: Sean Hefty [mailto:[EMAIL PROTECTED]] 
> Sent: Thursday, January 06, 2005 5:46 PM
> To: openib-general
> Subject: [openib-general] SRQ field in CM messages
> 
> 
> I've been coding the CM messages, and just setting the SRQ field in 
> them based on whether a QP has a SRQ.  My guess is that this 
> will work 
> fine, but my question is does anyone know why the CM or 
> remote QP cares 
> about this at all?  I want to make sure that I'm not missing 
> something 
> here.
> 
> - Sean




RE: [openib-general] ip over ib throughput

2005-01-06 Thread Diego Crupnicoff

I feel like we are talking about different things here:

The ***IP*** MTU is relevant for IPoIB performance because it determines the number of times that you are going to be hit by the per-packet overhead of the ***host*** networking stack. My point was that the ***IP*** MTU will not be tied to the ***IB*** MTU if a connected mode IPoIB (or SDP) is used instead of the current IPoIB that uses the IB UD transport service. The IB MTU would then be irrelevant to this discussion.

As for the eventual 2G ***IP*** MTU limit, it still sounds more than reasonable to me. I wouldn't mind if a 10TB file gets split into some IP packets of up to 2GB each.

With the exception of the UD transport service (where IB messages are limited to a single packet), the choice of ***IB*** MTU and its impact on performance is a completely unrelated issue. IB messages are split into packets and reassembled by the HCA HW, so the per-IB-message overhead of the SW stack is independent of the IB MTU. The choice of IB MTU may indeed affect performance for other reasons, but it is not immediately obvious that the largest available IB MTU is the best option for all cases. For example, latency optimization of small high priority packets under load may benefit from smaller IB MTUs (e.g. 256).

Diego

> -----Original Message-----
> From: Stephen Poole [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, January 06, 2005 5:45 AM
> To: Diego Crupnicoff
> Cc: 'openib-general@openib.org'
> Subject: RE: [openib-general] ip over ib throughput
> 
> Have you done any "load" analysis of a 2K vs. 4K MTU? Your analogy of
> having 2G as a total message size is potentially flawed. You seem to
> assume that 2G is the end-all in size; it is not. What about when you
> want to (down the road) use IB for files in the 1-10TB range?
> Granted, we can live with 2G, but it is not some nirvana number.
> Second, the 2G limit on message sizes only determines the upper bound
> in overall size; I could send 2G @ a 32-byte MTU. So the question is:
> how much less of a system load/impact would a 4K MTU be over a 2K
> MTU? Remember, even Ethernet finally decided to go to Jumbo Frames --
> why? System impact and more. Remember HIPPI/GSN: the MTU was 64K; the
> reason, system impact. The numbers I have seen running IPoIB really
> impact the system.
> 
> Steve...
> 
> At 10:38 AM -0800 1/5/05, Diego Crupnicoff wrote:
> > Note however that the relevant IB limit is the max ***message
> > size***, which happens to be equal to the ***IB*** MTU for the
> > current IPoIB (that runs on top of the IB UD transport service,
> > where IB messages are limited to a single packet).
> >
> > A connected mode IPoIB (that runs on top of the IB RC/UC transport
> > service) would allow IB messages up to 2GB long. That will allow
> > for much larger (effectively as large as you may ever dream of)
> > ***IP*** MTUs, regardless of the underlying IB MTU.
> >
> > Diego
> 
> -- 
> Steve Poole ([EMAIL PROTECTED])                 Office: 505.665.9662
> Los Alamos National Laboratory                  Cell:   505.699.3807
> CCN - Special Projects / Advanced Development   Fax:    505.665.7793
> P.O. Box 1663, MS B255, Los Alamos, NM 87545

RE: [openib-general] ip over ib throughput

2005-01-05 Thread Diego Crupnicoff

> -----Original Message-----
> From: Woodruff, Robert J [mailto:[EMAIL PROTECTED]] 
> Sent: Wednesday, January 05, 2005 3:53 PM
> To: Diego Crupnicoff; Hal Rosenstock; Peter Buckingham
> Cc: openib-general@openib.org
> Subject: RE: [openib-general] ip over ib throughput
> 
> 
> The trade off is that a fully connected model requires each 
> node to burn a QP for every node in the cluster and thus does 
> not scale as well as the UD model.  My guess is that if 
> people need really high performance 
> socket access, they will use SDP instead.
> 
> woody
> 


One QP for every node in the cluster does not sound that bad.


SDP is a good alternative too. It has even further benefits compared to IPoIB (built-in HW reliability that eliminates the TCP/IP stack, potential for zero copy, etc). However, in terms of QP requirements, SDP would consume even more QPs than a connected mode IPoIB would (still not too bad given the IB HW capabilities).

Diego






RE: [openib-general] ip over ib throughput

2005-01-05 Thread Diego Crupnicoff

Note however that the relevant IB limit is the max ***message size***, which happens to be equal to the ***IB*** MTU for the current IPoIB (that runs on top of the IB UD transport service, where IB messages are limited to a single packet).

A connected mode IPoIB (that runs on top of IB RC/UC transport service) would allow IB messages up to 2GB long. That will allow for much larger (effectively as large as you may ever dream of) ***IP*** MTUs, regardless of the underlying IB MTU.

Diego
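
To put numbers on the segmentation point, here is a small worked
example (the 65520-byte IP MTU is an illustrative assumption for a
connected-mode implementation):

#include <stdio.h>

int main(void)
{
    const unsigned long ib_mtu     = 2048;      /* current HCAs: 2K */
    const unsigned long rc_max_msg = 1UL << 31; /* RC message limit: 2GB */
    const unsigned long ip_mtu     = 65520;     /* illustrative CM IP MTU */

    /* In connected mode the HCA does this segmentation in HW, so the
     * host stack pays its per-packet overhead once per IP packet, not
     * once per IB packet. Over UD, the IP MTU is capped by the IB
     * MTU, so every IB packet costs a trip through the stack. */
    unsigned long ib_pkts = (ip_mtu + ib_mtu - 1) / ib_mtu;
    printf("%lu IB packets per %lu-byte IP packet (max message %lu)\n",
           ib_pkts, ip_mtu, rc_max_msg);
    return 0;
}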


> -----Original Message-----
> From: Hal Rosenstock [mailto:[EMAIL PROTECTED]] 
> Sent: Wednesday, January 05, 2005 2:21 PM
> To: Peter Buckingham
> Cc: openib-general@openib.org
> Subject: Re: [openib-general] ip over ib throughput
> 
> 
> On Wed, 2005-01-05 at 12:23, Peter Buckingham wrote:
> > stupid question: why are we limited to a 2K MTU for IPoIB?
> 
> The IB max MTU is 4K. The current HCAs support a max MTU of 2K.
> 
> -- Hal
> 




RE: [openib-general] [PATCH] Add support for querying port width/speed

2004-12-28 Thread Diego Crupnicoff

I think the choice of IPD is beyond the scope of the IB spec. The spec provides a mechanism to regulate the injection rate but does not mandate its use. Your assumption below (rounding up) makes perfect sense. Rounding down would defeat the entire purpose of the IPD mechanism.

Diego
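
A sketch of the round-up rule asked about below, with rates expressed
as multiples of the 1X SDR (2.5 Gb/s) rate (the helper is hypothetical,
not from any patch in this thread):

/* Sketch: the IB spec defines the effective injection rate as
 * link_rate / (IPD + 1), so staying at or below the path rate means
 *     IPD = ceil(port_rate / path_rate) - 1
 * e.g. ipd_for_path(16, 12) == 1  (8X DDR -> 12X SDR)
 *      ipd_for_path(32, 12) == 2  (8X QDR -> 12X SDR)  */
static unsigned ipd_for_path(unsigned port_rate, unsigned path_rate)
{
    return (port_rate + path_rate - 1) / path_rate - 1;
}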


> -----Original Message-----
> From: Roland Dreier [mailto:[EMAIL PROTECTED]] 
> Sent: Tuesday, December 28, 2004 3:11 AM
> To: openib-general@openib.org
> Subject: Re: [openib-general] [PATCH] Add support for 
> querying port width/speed
> 
> 
> By the way, does anyone know what IPD is supposed to be used 
> when the injection port's rate is not a multiple of the rate 
> for the path?  For example, suppose an 8X DDR port is sending 
> to a 12X path -- 12 doesn't divide 16 evenly.
> 
> I would guess that the correct thing to do is round up, so 8X DDR
> -> 12X uses an IPD of 1, 8X QDR -> 12X uses an IPD of 2, etc.
> 
>  - R.
> 


