Re: [ewg] some RDS related issues in OFED 1.3

2007-10-29 Thread Or Gerlitz
On 10/10/07, Or Gerlitz <[EMAIL PROTECTED]> wrote:
>
> Hi Vlad,
>
> Opening a bug on the ofa bugzilla whose component is RDS results in
> [EMAIL PROTECTED] being assigned to it ...
>
> I understand it should be assigned to you? if yes, can you fix it?
>
> In the meantime I have re-assigned the two RDS problems reported by
> Voltaire (https://bugs.openfabrics.org/show_bug.cgi?id=724 and id=723)
> to you, where 724 is actually quite disturbing since RDS cannot be used
> without an oops following on module unload.



Hi Vlad, any update on that?

Or.
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg

[ewg] [PATCH] ofed_scripts: Add location code fix for older ppc64 kernels

2007-10-29 Thread Joachim Fenkes
Kernels prior to 2.6.24 have problems with multiple devices sharing the same
location code on ppc64 systems -- only one of these devices would be usable
by ibmebus. This will be a problem on systems with multiple eHCA chips on a
single hardware location.

For older kernels, this problem can be circumvented by, prior to loading the
eHCA driver, changing the location codes of the offending devices so that
they're not the same anymore.

This patch adds an openibd patch file which, if applied, will make openibd
change the location codes of eHCA adapters with the same location code.
ofed_patch.sh is changed so that it applies that patch if, and only if, it
is run on a ppc64 architecture and the kernel version implies that the
kernel has the ibmebus bug.
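
The version gate described above can be sketched in isolation. This is only an illustration of the generic `2.6.N` branch: the helper name `kernel_minor_lt` is hypothetical and not part of the patch, and the distribution-specific branches (RHEL 4, SLES 10, RHEL 5) are deliberately left out.

```shell
# Hypothetical helper (not part of ofed_patch.sh): succeed when a 2.6.x
# kernel version string has a minor number below the given threshold.
kernel_minor_lt() {
    kver=$1
    threshold=$2
    case $kver in
    2.6.*)
        # Third dot-separated field, cut at the first "-", digits only.
        minor=$(echo "$kver" | cut -d. -f3 | cut -d- -f1 | tr -cd '0-9')
        if [ -n "$minor" ] && [ "$minor" -lt "$threshold" ]; then
            return 0
        fi
        ;;
    esac
    return 1
}
```

For example, `kernel_minor_lt "$(uname -r)" 24` would succeed on a 2.6.23 kernel and fail on 2.6.24-rc1.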

Signed-off-by: Joachim Fenkes <[EMAIL PROTECTED]>
---
 ofed_scripts/ofed_patch.sh  |   49 +++
 ofed_scripts/openibd-loc_code.patch |   43 ++
 2 files changed, 92 insertions(+), 0 deletions(-)
 create mode 100644 ofed_scripts/openibd-loc_code.patch

diff --git a/ofed_scripts/ofed_patch.sh b/ofed_scripts/ofed_patch.sh
index e1f039d..b254000 100755
--- a/ofed_scripts/ofed_patch.sh
+++ b/ofed_scripts/ofed_patch.sh
@@ -200,6 +200,44 @@ get_backport_dir()
 
 }
 
+need_openibd_loc_code_patch()
+{
+   local sub
+
+   if [ "$ARCH" != "ppc64" ]; then
+       return 1;
+   fi
+
+   case $KVERSION in
+       2.6.9-*.EL*)
+           sub=$(echo $KVERSION | cut -d"-" -f2 | cut -d"." -f1)
+           if [ $sub -lt 62 ]; then
+               return 0;
+           fi
+       ;;
+       2.6.16.*-*-*)
+           sub=$(echo $KVERSION | cut -d"." -f4 | cut -d"-" -f1)
+           if [ $sub -lt 53 ]; then
+               return 0;
+           fi
+       ;;
+       2.6.18-*.el5*)
+           sub=$(echo $KVERSION | cut -d"-" -f2 | cut -d"." -f1)
+           if [ $sub -lt 52 ]; then
+               return 0;
+           fi
+       ;;
+       2.6.*)
+           sub=$(echo $KVERSION | cut -d"." -f3 | cut -d"-" -f1 | tr -d [:alpha:][:punct:])
+           if [ $sub -lt 24 ]; then
+               return 0;
+           fi
+       ;;
+   esac
+
+   return 1;
+}
+
 # Apply patch
 apply_patch()
 {
@@ -253,6 +291,13 @@ apply_backport_patches()
 fi
 }
 
+apply_openibd_patches()
+{
+   if need_openibd_loc_code_patch; then
+       apply_patch ${CWD}/ofed_scripts/openibd-loc_code.patch
+   fi
+}
+
 # Apply patches
 patches_handle()
 {
@@ -288,6 +333,9 @@ EOF
 fi
     BACKPORT_INCLUDES='-I${CWD}/kernel_addons/backport/'${BACKPORT_DIR}/include/
 fi
+
+# Apply openibd patches
+apply_openibd_patches $KVERSION
 
 
 #FIXME: why are these applied here? Move them to before backports?
@@ -399,6 +447,7 @@ main()
 
 #Set default values
 KVERSION=${KVERSION:-$(uname -r)}
+ARCH=${ARCH:-$(uname -m)}
 WITH_QUILT=${WITH_QUILT:-"yes"}
 WITH_PATCH=${WITH_PATCH:-"yes"}
 WITH_KERNEL_FIXES=${WITH_KERNEL_FIXES:-"yes"}
diff --git a/ofed_scripts/openibd-loc_code.patch b/ofed_scripts/openibd-loc_code.patch
new file mode 100644
index 000..43d70b4
--- /dev/null
+++ b/ofed_scripts/openibd-loc_code.patch
@@ -0,0 +1,43 @@
+--- a/ofed_scripts/openibd 2007-10-25 08:01:51.0 -0500
++++ b/ofed_scripts/openibd 2007-10-27 09:58:56.0 -0500
+@@ -538,6 +538,32 @@ if test -x /sbin/lspci && test -x /sbin/
+ fi
+ }
+ 
++fix_location_codes()
++{
++   # ppc64 only:
++   # Fix duplicate location codes on kernels where ibmebus can't handle them
++   if [ -d /proc/device-tree -a -f /proc/ppc64/ofdt ]; then
++       local i=1 phandle lcode len
++       # output all duplicate location codes and their devices
++       for attr in $(find /proc/device-tree -wholename "[EMAIL PROTECTED]/ibm,loc-code"); do
++           echo -e $(dirname $attr)"\t"$(cat $attr)
++       done | sort -k2 | uniq -f1 --all-repeated=separate | cut -f1 | while read dev; do
++           if [ -n "$dev" ]; then
++               # append an instance counter to the location code
++               phandle=$(hexdump -e '8 "%u"' $dev/ibm,phandle)
++               lcode=$(cat $dev/ibm,loc-code)-I$i
++               len=$(echo -n "$lcode" | wc -c)
++               # echo "$dev -> $lcode"
++               echo -n "update_property $phandle ibm,loc-code $len $lcode" > /proc/ppc64/ofdt
++               i=$(($i + 1))
++           else
++               # empty line means new group -- reset i
++               i=1
++           fi
++       done
++   fi
++}
++
+ rotate_log()
+ {
+ local log=$1
+@@ -694,6 +720,7 @@ start()
+ 
+ 
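
The grouping step in fix_location_codes() (sort by location code, keep only the duplicates, number them within each group) can be tried out on made-up data without touching /proc. The sketch below assumes GNU coreutils, since `uniq --all-repeated=separate` is a GNU extension; the function name `number_duplicates` and the device names in the usage note are hypothetical, and nothing is written to /proc/ppc64/ofdt.

```shell
# Hypothetical stand-alone version of the grouping pipeline.
# Input on stdin: one "device<TAB>location-code" pair per line.
# Output: "device -> code-I<n>" for every device that shares its code.
number_duplicates() {
    i=1
    LC_ALL=C sort -k2 | uniq -f1 --all-repeated=separate |
    while IFS="$(printf '\t')" read -r dev code; do
        if [ -n "$dev" ]; then
            # append an instance counter to the location code
            echo "$dev -> $code-I$i"
            i=$((i + 1))
        else
            # an empty line separates groups, so restart the counter
            i=1
        fi
    done
}
```

Feeding it `printf 'devA\tU1\ndevB\tU1\ndevC\tU2\n' | number_duplicates` should list only devA and devB, since devC has a unique location code.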


[ewg] to be discussed at the developer conference

2007-10-29 Thread Or Gerlitz
Assuming the schedule ends up allocating enough slots for the Linux IB 
developers to discuss whatever they decide they need to, I'd like to 
check with people what we want on the agenda of these slots. The issues 
I had in mind are:


1) The long-running, endless threads about SA caching need to be 
covered. Sean - I saw that you are preparing a session, correct? Will 
you be presenting a few possible designs?


2) IPoIB stateless offload - Eli and Liran are not planning to be 
there. Dror - do you intend to present the actual ipoib / core / 
drivers design and implementation? Also, personally, I felt that the 
1-2 slides you delivered at Sonoma were way below what would let one 
understand exactly which features the HW supports, and I don't want to 
be referred to under-NDA docs; let's just have you provide a clear 
description of large-send and checksum offloading. Same for the HW 
interrupt mitigation: it would be nice if you explained the problem and 
the solution, and spared a few words on how this goes with NAPI. One 
more thing is the LRO stuff - it's a pure SW optimization, so if you 
think it should be in the ipoib code, some justification material would 
be helpful.


3) QoS - Sean, Dror, generally speaking, what were you thinking of 
discussing?


4) IPoIB connected mode UC support - Roland, can work on this start once 
the no-SRQ design/code is agreed and committed to a branch in your git 
tree? In previous discussions with Michael on this list he insisted that 
some "keep alive" probing mechanism must be implemented, since the ARP 
probes sent by the kernel neighbour subsystem are not enough to cover 
all cases, and he suggested using IB CM LAP messages etc. for that. What 
open issues can you think of here? Would you be able to present this?


5) IB 4K MTU - in IPoIB and elsewhere in the IB stack. Same here: 
Roland, do you think a short session is needed, or do your comments, e.g. 
http://lkml.org/lkml/2007/9/13/308 & http://lkml.org/lkml/2007/9/14/173 
cover everything that needs to be done? Is there something to change at 
layers below IPoIB? What about SM implementations - does anyone see 
possible required changes there?


6) The netdev network batching RFCs - Krishna, Shirley, can someone 
from IBM prepare a session to educate us on the matter and its status?


any more ideas?

Or.



[ewg] RE: [promoters] OpenFabrics Developer's Summit: feedback requested

2007-10-29 Thread Kanevsky, Arkady
Just want to raise my voice against parallel sessions.
We did it a couple of times at Sonoma.
No matter how you divide the talks between
tracks, there is still a huge overlap.

Arkady Kanevsky   email: [EMAIL PROTECTED]
Network Appliance Inc.   phone: 781-768-5395
1601 Trapelo Rd. - Suite 16.Fax: 781-895-1195
Waltham, MA 02451   central phone: 781-768-5300
 

> -----Original Message-----
> From: Johann George [mailto:[EMAIL PROTECTED] 
> Sent: Monday, October 29, 2007 2:52 AM
> To: Or Gerlitz
> Cc: [EMAIL PROTECTED]; 
> ewg@lists.openfabrics.org; [EMAIL PROTECTED]
> Subject: [promoters] OpenFabrics Developer's Summit: feedback 
> requested
> 
> Or and Jeff,
> 
> Thanks again for your input.  I like Or's idea of starting 
> the summit earlier but am concerned as to whether people 
> could attend.  I'm also not sure we could have access to the 
> room earlier although I suspect that will be possible.
> 
> Regarding parallel tracks, we currently do not have another 
> room to handle that.  But we can investigate if this might be 
> possible at reasonable cost.
> 
> I would like to hear from the attendees since this summit is 
> for you.  Perhaps you can vote on the following three 
> questions.  I'll tally the votes that come from registered 
> attendees by the end of the week and act on them as best as I 
> can.  This might be a good time to remind you that if you 
> have not registered, please do so by following this link:
> 
> http://www.acteva.com/booking.cfm?bevaid=143964
> 
> (1) Are you willing and able to attend if we start at
> 11:00am on Thursday rather than at 1:00pm?
> 
> (2) If we are able to, would you prefer to see simultaneous
> tracks and lengthen some of the sessions.
> 
> (3) Would you like to see additional MPI sessions crammed
> into the allotted time?
> 
> To avoid polluting all the mailing lists, feel free to reply 
> just to me unless you wish to do otherwise.
> 
> Thanks.
> 
> Johann
> ___
> promoters mailing list
> [EMAIL PROTECTED]
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/promoters
> 


[ewg] Reminder OFED teleconference meeting today

2007-10-29 Thread Tziporet Koren


Friendly reminder: the OFED teleconference is today, Monday (29 Oct, 2007).

All are at noon US eastern / 9am US Pacific / -=>6pm Israel<=-
Monday, Oct 29, code 210630781


Dial-in information:

US/Canada:  +1.866.432.9903
India:  +91.80.4103.3979
Israel: +972.9.892.7026
Others: http://cisco.com/en/US/about/doing_business/conferencing/


Tziporet (for Jeff that is on the plane)



Re: [ewg] Compile error in latest daily build

2007-10-29 Thread Hal Rosenstock
On Thu, 2007-10-25 at 12:13 +0200, Yevgeny Kliteynik wrote:
> Done.

Was this fixed in the ibutils git repo too ?

-- Hal

> 
> -- Yevgeny
> 
> Hal Rosenstock wrote:
> > On Wed, 2007-10-24 at 13:08 -0700, Woodruff, Robert J wrote:
> >> I ran into this compile error in today's daily build,
> >> not sure who this should be assigned to...
> >>
> >> Running rpm -iv
> >> /root/OFED-1.3-20071024-0645/RPMS/redhat-release-4AS-5.5/sdpnetstat-1.60
> >> -0.1.ofed20070909.x86_64.rpm
> >> Build srptools RPM
> >> ibis_wrap.c: In function `_wrap_sacClassPortInfo_resp_time_val_set':
> >> ibis_wrap.c:22491: error: structure has no member named `resp_time_val'
> >> ibis_wrap.c:22491: warning: left-hand operand of comma expression has no
> >> effect
> >> ibis_wrap.c: In function `_wrap_sacClassPortInfo_resp_time_val_get':
> >> ibis_wrap.c:22543: error: structure has no member named `resp_time_val'
> >> make[3]: *** [ibis_wrap.lo] Error 1
> >> make[3]: Leaving directory
> >> `/var/tmp/OFED_topdir/BUILD/ibutils-1.2/ibis/src'
> > 
> > The need to update ibutils was identified by Sasha the other day but I
> > guess this wasn't done yet. I'm not sure whether Eitan is around. Maybe
> > Yevgeny can do this.
> > 
> > -- Hal
> > 
> >> ___
> >> ewg mailing list
> >> ewg@lists.openfabrics.org
> >> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
> > 
> 


[ewg] ofed_kernel merged with 2.6.24-rc1 patches update required

2007-10-29 Thread Vladimir Sokolovsky

Hello,
There is a new branch "ofed_kernel_2_6_24_rc1" under 
git://git.openfabrics.org/ofed_1_3/linux-2.6.git

All patches from kernel_patches/fixes that were applied in 2.6.24-rc1 were 
removed from kernel_patches/fixes directory.
The "problematic" patches from kernel_patches/fixes were moved to the 
kernel_patches/attic directory.

Backport patches and fixes should be updated according to the new kernel tree.
The easiest way to do so is to use the "ofed_scripts/ofed_makedist.sh" utility, 
which creates a tgz file for every supported kernel with all relevant patches applied.

We want to move to the new branch this Wednesday (31 Oct 2007).
Please send me updated backport patches and fixes by tomorrow.


Regards,
Vladimir


[ewg] Re: OpenFabrics Developer's Summit: feedback requested

2007-10-29 Thread Johann George
Someone proposed a fourth option worth considering which is staying
later on Friday.  Here are the alternatives we are looking for
feedback on:

(1) Are you willing and able to attend if we start at
11:00am on Thursday rather than at 1:00pm?

(2) If we are able to, would you prefer to see simultaneous
tracks and lengthen some of the sessions.

(3) Would you like to see additional MPI sessions crammed
into the allotted time?

(4) Are you willing and able to stay if we ran later on
Friday?  How long?

Thanks.

Johann


[ewg] iw_cxgb3 genalloc memory allocator dependency

2007-10-29 Thread Steve Wise
The iw_cxgb3 module depends on the Linux kernel genalloc service.  This 
service gets compiled into the kernel _only_ if another subsystem has a 
config dependency on the genalloc module (CONFIG_GENERIC_ALLOCATOR). In 
addition, there are only two users of this service:  iw_cxgb3 and some 
IA64 subsystem.  So on a kernel.org kernel that has iw_cxgb3, genalloc 
gets built into the kernel when you enable the iw_cxgb3 module.  But on 
non-IA64 platforms that do not have iw_cxgb3 configured in, the genalloc 
code is not pulled into the kernel.


The side effect of this is that if one tries to compile OFED on a 
kernel.org kernel that doesn't have iw_cxgb3 configured, the genalloc 
service is not available and OFED doesn't compile.


Now, ofed has a backport of genalloc to support older kernels that do 
not even have the genalloc service.  But we don't pull in that backport 
for kernels that do have genalloc.  Thus the problem...


I'm looking for suggestions on whether and how we should do something 
about this.  Here are some ideas:


1) always build in our own genalloc service as a backport.  This solves 
the problem, but duplicates the code if it is indeed built into the kernel.


2) detect at OFED config time whether we need the genalloc service or 
not.  Then pull in the backport as needed.  This one is nice in that it 
won't replicate the genalloc code when not needed, but at the expense of 
adding complexity to the configure script for OFED.  I'm not really sure 
how to do it at all.  But maybe Vlad knows how?
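
One possible shape for the config-time detection in idea 2, offered only as a sketch: look for CONFIG_GENERIC_ALLOCATOR in the target kernel's config file. The function name `kernel_has_genalloc` and the config path in the usage note are assumptions, not something the OFED configure script is known to provide.

```shell
# Hypothetical check: does the given kernel config enable genalloc?
# Returns 0 if enabled, 1 if not, 2 if the config file cannot be read.
kernel_has_genalloc() {
    config=$1
    [ -r "$config" ] || return 2
    grep -q '^CONFIG_GENERIC_ALLOCATOR=[ym]' "$config"
}
```

A configure step could then call something like `kernel_has_genalloc "/lib/modules/$(uname -r)/build/.config"` and pull in the genalloc backport only when the check returns 1.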



Thoughts?

BTW: bug 767 opened to track this.

Thanks,

Steve.


[ewg] OFED meeting agenda (Oct 29) about beta readiness:

2007-10-29 Thread Tziporet Koren
This is the agenda for the OFED meeting about beta readiness:

1. Review beta tasks status:
    1. Fix compilation problems on PPC with 32 bits - Vlad & Oren (Mellanox) - in progress
    2. Rebase kernel code on 2.6.24 rc1 (depending on its availability) - in progress (please read the mail from Vlad with instructions)
    3. SPEC files should be part of each user space package - each owner should take the spec file
    4. Multiple uDAPL libs (1.0 & 2.0) - Vlad and Arlin (Intel)
    5. nes - need to update some backport patches - Glenn (NetEffect)

Any new tasks?

Done tasks:
*   Add qperf test from Qlogic - Johann (Qlogic)
*   Support RHEL 5 up1 - Woody & Vlad
*   Apply patches that fix warnings of backport patches - Vlad (Mellanox) (one patch was not applied since we got no answer regarding it)
*   New MVAPICH package - Pasha & DK (OSU)
*   Complete RDS work - Vlad (Mellanox)
*   Integrate all SDP features - Jim (Mellanox)




2. Open bugs - review most critical bugs:

bug_id  bug_severity  op_sys   assigned_to        short_short_desc
753     blocker       SLES 10  [EMAIL PROTECTED]  ibutils src.rpm compile error on SLES10 SP1 js21 PPC64
744     blocker       RHEL 4   [EMAIL PROTECTED]  OFED 1.3 Cheetah IPoIB netperf UDP_STREAM fails, causes IPoIB to stop working
757     blocker       Other    [EMAIL PROTECTED]  ipoib cm - traffic does not work over partioning interfaces when the mode is connected.
756     critical      RHEL 5   [EMAIL PROTECTED]  OFED-1.3-20071024-0645 ibutils won't compile on RHEL4/RHEL5
750     critical      SLES 10  [EMAIL PROTECTED]  Problem with modprobe ib_ehca with older kernel versions
746     critical      SLES 10  [EMAIL PROTECTED]  Installation of 32-bit libibverbs failed
758     critical      SLES 10  [EMAIL PROTECTED]  IPOIB_CM is not compiled via install.pl
760     major         All      [EMAIL PROTECTED]  UDP performance on Rx is lower than Tx
761     major         Other    [EMAIL PROTECTED]  Poor and jittery UDP performance at small messages
508     major         RHEL 4   [EMAIL PROTECTED]  IPoIB CM multicast is hogging interrupts
751     major         RHEL 5   [EMAIL PROTECTED]  MVAPICH won't build mpif77 and mpif90 with PGI 7.0
736     major         Other    [EMAIL PROTECTED]  IBV_WC_RETRY_EXC_ERR errors with local rdma_reads
730     major         RHEL 4   [EMAIL PROTECTED]  OFED 1.3 MPI won't compile with PGI 6.2.5 on RHEL4 x86_64
740     major         All      [EMAIL PROTECTED]  OFED 1.3 install.pl is missing functionality (OFA_KERNEL_PARAMS and K_VER) that install.sh had
747     major         All      [EMAIL PROTECTED]  SRPHA_ENABLE missing from OFED 1.3 alpha2 openib.conf
733     normal        All      [EMAIL PROTECTED]  create a CQ with a number which is power of 2 will result waste of memory
762     normal        All      [EMAIL PROTECTED]  create an XRC QP with NULL in the xrc_domain causes kernel oops
763     normal        All      [EMAIL PROTECTED]  XRC domain can be closed event QP/SRQ are using it
689     normal        Other    [EMAIL PROTECTED]  When one install ofed (including the mpi-selector) and choosing prefix that end with "/", Install fails.
755     normal        Other    [EMAIL PROTECTED]  openMPI src.rpm compile error on SLES10 SP1 JS21 PPC64
692     normal        Other    [EMAIL PROTECTED]  Ping over IPoIB interface stops working when running openibd restart with bonding enabled.
709     normal        All      [EMAIL PROTECTED]  ibutils binaries have wrong RPATH
765     normal        Other    [EMAIL PROTECTED]  ofed-1.3 and ofed-1.2.5 can't burn mlx4 HCA's with old FWR (2.0.150)
754     normal        SLES 10  [EMAIL PROTECTED]  mvapich2 src.rpm compile error on js21 PPC64 SLES10 SP1
752     normal        Other    [EMAIL PROTECTED]  opensm daemon failed to start
721     normal        SLES 10  [EMAIL PROTECTED]  OFED 1.3 installation failed: Failed to build opensm RPM
723     normal        RHEL 4   [EMAIL PROTECTED]  netperf over rds failed - rds_send: data send error: Invalid argument
724     normal        SLES 10  [EMAIL PROTECTED]  Oops during rds module unload
739     normal        Other    [EMAIL PROTECTED]  install.pl doesn't want to die
742     normal        Other    [EMAIL PROTECTED]  mpi-selector not working in 1.3-alpha2
748     normal        Other    [EMAIL PROTECTED]  install failed
764     normal        SLES 10  [EMAIL PROTECTED]  Installation bug,
766     normal        SLES 10  [EMAIL PROTECTED]  Installation bug,
690     minor         All      [EMAIL PROTECTED]  Attempt is made to install mvapich2 even when user says don't install it


Re: [ewg] Compile error in latest daily build

2007-10-29 Thread Yevgeny Kliteynik

Hal Rosenstock wrote:
> On Thu, 2007-10-25 at 12:13 +0200, Yevgeny Kliteynik wrote:
> > Done.
>
> Was this fixed in the ibutils git repo too ?

Yes, but the current ibutils maintainer is Oren Kladnitsky,
so you need to pull ibutils from here:

git://staging.openfabrics.org/~orenk/ibutils

--Yevgeny

> -- Hal
>
> > -- Yevgeny
> >
> > Hal Rosenstock wrote:
> > > On Wed, 2007-10-24 at 13:08 -0700, Woodruff, Robert J wrote:
> > > > I ran into this compile error in today's daily build,
> > > > not sure who this should be assigned to...
> > > >
> > > > Running rpm -iv
> > > > /root/OFED-1.3-20071024-0645/RPMS/redhat-release-4AS-5.5/sdpnetstat-1.60
> > > > -0.1.ofed20070909.x86_64.rpm
> > > > Build srptools RPM
> > > > ibis_wrap.c: In function `_wrap_sacClassPortInfo_resp_time_val_set':
> > > > ibis_wrap.c:22491: error: structure has no member named `resp_time_val'
> > > > ibis_wrap.c:22491: warning: left-hand operand of comma expression has no
> > > > effect
> > > > ibis_wrap.c: In function `_wrap_sacClassPortInfo_resp_time_val_get':
> > > > ibis_wrap.c:22543: error: structure has no member named `resp_time_val'
> > > > make[3]: *** [ibis_wrap.lo] Error 1
> > > > make[3]: Leaving directory
> > > > `/var/tmp/OFED_topdir/BUILD/ibutils-1.2/ibis/src'
> > >
> > > The need to update ibutils was identified by Sasha the other day but I
> > > guess this wasn't done yet. I'm not sure whether Eitan is around. Maybe
> > > Yevgeny can do this.
> > >
> > > -- Hal


[ewg] Re: [ofa-general] to be discussed at the developer conference

2007-10-29 Thread Sean Hefty
> 1) the long time and endless threads related to the SA caching thing 
> need to be there. Sean - I saw that you prepare a session, correct? will 
> you presenting few possible designs?

I was asked to prepare a session and will mention some of the general 
scalability issues that we've seen with Intel MPI.

> 3) QoS - Sean, Dror, generally speaking, what where you thinking to 
> discuss?


We plan on discussing what was added to the stack and opensm.

Keep in mind that both of these are only 20 minutes.

- Sean


[ewg] Re: [ofa-general] Running netperf and iperf over SDP

2007-10-29 Thread Rick Jones

Moshe Kazir wrote:
> While running netperf and iperf over SDP I get unstable performance
> results.
>
> On iperf I get more than 25% difference between minimum and maximum.
>
> On netperf I get the following amazing result. ->
>
> ++
>  # while LD_PRELOAD=/usr/lib64/libsdp.so ./netperf -H 192.168.7.172 -- -m 512 -M 1047 ; do echo . ; done
> TCP STREAM TEST to 192.168.7.172
> Recv   Send    Send
> Socket Socket  Message  Elapsed
> Size   Size    Size     Time     Throughput
> bytes  bytes   bytes    secs.    10^6bits/sec
>
> 126976 126976    512    10.00    3247.39
> .
> TCP STREAM TEST to 192.168.7.172
> Recv   Send    Send
> Socket Socket  Message  Elapsed
> Size   Size    Size     Time     Throughput
> bytes  bytes   bytes    secs.    10^6bits/sec
>
> 126976 126976    512    10.00    1222.48
> .
> ++
>
> Can someone spare me a hint?
> What am I doing wrong?


I would start by telling netperf you want CPU utilization measured.  That way we 
can see if there is a correlation between the CPU utilization and the throughput.


Also, if this were a "pure" TCP test, you would have a race between the nagle 
algorithm and the speed at which ACK's come-back from the receiver affecting the 
distribution of TCP segment sizes being transmitted since your send size is so 
much smaller than the MTU of the link.  IIRC for IPoIB in 1.2.mumble the MTU is 
65520 or something like that.  You might consider taking snapshots of the 
link-level statistics (does ethtool -S work for an IB interface?) from before 
and after each netperf test and run them through beforeafter:


ftp://ftp.cup.hp.com/dist/networking/tools/

You might also experiment with setting TCP_NODELAY - although since this is 
LD_PRELOADED SDP I'm not sure what that really means/does.


Any particular reason you are telling the netperf side to post 1047 byte 
receives when you are making 512 byte calls to send()?




> The test ran on x86_64, SLES 10 SP1, OFED-1.2.5


I'm guessing you have multiple cores - how do interrupts from the HCA (?) get 
distributed?  What happens when you use the -T option of netperf to vary the CPU 
binding of either netperf or netserver:


netperf -T N,M   # bind netperf to CPU N, netserver to CPU M
netperf -T N,    # just bind netperf to CPU N, netserver unbound
netperf -T ,M    # netperf unbound, netserver bound to CPU M

relative to where the interrupts from the HCA go?
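
A small dry-run helper can list the runs needed to try every netperf/netserver CPU pairing while also asking for CPU utilization on both ends (the `-c` and `-C` global options). This is a sketch only: the function name `netperf_binding_cmds` is hypothetical, the 512-byte send size is carried over from the original report, and the commands are printed rather than executed.

```shell
# Hypothetical dry-run helper: print one netperf command per (N,M) CPU pair.
# $1 = netserver host, $2 = number of CPUs to iterate over on each side.
netperf_binding_cmds() {
    host=$1
    ncpus=$2
    n=0
    while [ "$n" -lt "$ncpus" ]; do
        m=0
        while [ "$m" -lt "$ncpus" ]; do
            # -c/-C report local/remote CPU utilization, -T binds the CPUs
            echo "netperf -H $host -c -C -T $n,$m -- -m 512"
            m=$((m + 1))
        done
        n=$((n + 1))
    done
}
```

For the LD_PRELOADed SDP case each printed command would still need the `LD_PRELOAD=/usr/lib64/libsdp.so` prefix used in the original report.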

Finally, well for now :), there are "direct" SDP tests in netperf.  Make sure 
you are on say 2.4.4:


http://www.netperf.org/svn/netperf2/tags/netperf-2.4.4/   or
ftp://ftp.netperf.org/netperf

and add --enable-sdp to the ./configure command.

happy benchmarking,

rick jones


[ewg] Re: ofed_kernel merged with 2.6.24-rc1 patches update required

2007-10-29 Thread Steve Wise



Vladimir Sokolovsky wrote:
> Hello,
> There is a new branch "ofed_kernel_2_6_24_rc1" under 
> git://git.openfabrics.org/ofed_1_3/linux-2.6.git
>
> All patches from kernel_patches/fixes that were applied in 2.6.24-rc1 
> were removed from kernel_patches/fixes directory.
> The "problematic" patches from kernel_patches/fixes were moved to the 
> kernel_patches/attic directory.
>
> Backport patches and fixes should be updated according to the new kernel 
> tree.
> The easy way to do so is using "ofed_scripts/ofed_makedist.sh" utility 
> which creates tgz file for every supported kernel with all relevant 
> patches applied.




Vlad,  have you done any builds against the various kernels?  What 
exactly should I, as cxgb3 owner, do with this branch other than verify 
the patches are correct?


Steve.


Re: [ewg] Re: ofed_kernel merged with 2.6.24-rc1 patches update required

2007-10-29 Thread Vladimir Sokolovsky

Steve Wise wrote:
> Vladimir Sokolovsky wrote:
> > Hello,
> > There is a new branch "ofed_kernel_2_6_24_rc1" under 
> > git://git.openfabrics.org/ofed_1_3/linux-2.6.git
> >
> > All patches from kernel_patches/fixes that were applied in 2.6.24-rc1 
> > were removed from kernel_patches/fixes directory.
> > The "problematic" patches from kernel_patches/fixes were moved to the 
> > kernel_patches/attic directory.
> >
> > Backport patches and fixes should be updated according to the new 
> > kernel tree.
> > The easy way to do so is using "ofed_scripts/ofed_makedist.sh" utility 
> > which creates tgz file for every supported kernel with all relevant 
> > patches applied.
>
> Vlad,  have you done any builds against the various kernels?  What 
> exactly should I, as cxgb3 owner, do with this branch other than verify 
> the patches are correct?
>
> Steve.


Currently some backport patches fail to apply.
Please verify that the cxgb3 backport patches can be applied and that all 
required patches are present under kernel_patches/fixes.
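
One way to check whether a set of patches still applies, without modifying the tree, is GNU patch's `--dry-run` mode. The sketch below is only an illustration: the function name `check_patches` is hypothetical, it assumes `-p1` style patches, and the directory layout in the usage note is a guess.

```shell
# Hypothetical helper: dry-run every *.patch in a directory against a tree.
# $1 = source tree, $2 = directory holding the patches.
# Prints OK/FAIL per patch and returns non-zero if any patch fails.
check_patches() {
    tree=$1
    patch_dir=$(cd "$2" && pwd) || return 2
    status=0
    for p in "$patch_dir"/*.patch; do
        [ -e "$p" ] || continue
        # --dry-run leaves the tree untouched, -f avoids interactive prompts
        if (cd "$tree" && patch -p1 --dry-run -s -f < "$p" >/dev/null 2>&1); then
            echo "OK   $(basename "$p")"
        else
            echo "FAIL $(basename "$p")"
            status=1
        fi
    done
    return $status
}
```

It could be run, for example, as `check_patches . kernel_patches/fixes` from the top of a checkout of the new branch. Note that each patch is tested on its own against the unmodified tree, so patches that depend on earlier ones in a series may be reported as failing.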

Vladimir.


Re: [ofa-general] Re: [ewg] to be discussed at the developer conference

2007-10-29 Thread Sean Hefty
> 7) the inform info code. Sean - you have implemented and attempted to 
> push it through the sa caching push, but since the cache was rejected so 
> did the inform info code. So the questions here - how do we make this 
> push happen? are there any open issues, etc


There either needs to be an in kernel user, or we need to reach 
agreement on the best way to expose this to userspace.  Neither this, 
nor the multicast code are directly exported.


I have seen e-mails on the list that event subscription is used by 
userspace apps, but it is done via the MAD layer directly.


- Sean


Re: [ewg] to be discussed at the developer conference

2007-10-29 Thread Or Gerlitz
On 10/29/07, Or Gerlitz <[EMAIL PROTECTED]> wrote:
>
> (Assuming that the allocation of slots within the schedule to have
> enough time for Linux IB developers to discuss what ever they decide
> they need to would be taken care of) I'd like to check with people what
> we want to be on the agenda of these slots. My thinking for issues to
> discuss was:
> 
> any more ideas?
>

7) The inform info code. Sean - you implemented it and attempted to push
it along with the SA caching push, but since the cache was rejected, so was
the inform info code. So the questions here are: how do we make this push
happen? Are there any open issues, etc.?

Or.

Re: [ewg] Re: [ofa-general] to be discussed at the developer conference

2007-10-29 Thread Or Gerlitz
On 10/29/07, Sean Hefty <[EMAIL PROTECTED]> wrote:
>
> > 1) the long time and endless threads related to the SA caching thing
> > need to be there. Sean - I saw that you prepare a session, correct? will
> > you presenting few possible designs?
>
> I was asked to prepare a session and will mention some of the general
> scalability issues that we've seen with Intel MPI.

> > 3) QoS - Sean, Dror, generally speaking, what where you thinking to
> > discuss?
>
> We plan on discussing what was added to the stack and opensm.
>
> Keep in mind that both of these are only 20 minutes.


Sean,

As you may have seen in the thread "OpenFabrics Developer's Summit: tentative
agenda", I am working to get the Linux IB issues whatever time we need to
discuss them. You can assume at least 45 minutes (and more if needed) for
SA caching, so you can go much further than the problem description, e.g.
sketch a few possible designs / implementations. This is a two-year-old open
issue which needs to be solved. Similarly for QoS, I'd go further and discuss
open issues, if there are any that you are aware of. Who's going to present
the opensm changes - you or Dror?

Or.

[ewg] librdmacm 1.0.4 release

2007-10-29 Thread Sean Hefty
I've pushed out a release 1.0.4 of librdmacm that addresses some of the feedback
from Doug.  Patches were posted previously to the list, with a small update
based on that feedback.

Please pull this release into OFED 1.3.

Changes from 1.0.3:
librdmacm/cma: provide wrapper functions to extract src/dst addresses
librdmacm/cma: provide sanity checks for max outstanding rdma ops
librdmacm/man: update man pages to clarify connection request params

Thanks,
- Sean


[ewg] Re: [ofa-general] Re: OpenFabrics Developer's Summit: feedback requested

2007-10-29 Thread Or Gerlitz
On 10/29/07, Johann George <[EMAIL PROTECTED]> wrote:
>
> (2) If we are able to, would you prefer to see simultaneous
> tracks and lengthen some of the sessions.


It makes some sense to poll whether people prefer simultaneous tracks or
not; however, even if the answer is "no", you still can't allocate only 20
minutes for Linux IB open issues. That would mean having no simultaneous
tracks at the price of removing other things from the agenda.


> (3) Would you like to see additional MPI sessions crammed
> into the allotted time?


Getting input from COMMERCIAL MPIs need not be subject to this or that
poll result; moreover, as Jeff commented, in all the previous meetings
MVAPICH and OMPI people were able to provide updates and feedback, so
getting updates from other MPIs is more important in this time frame.

Or.

[ewg] Re: [ofa-general] to be discussed at the developer conference

2007-10-29 Thread Dror Goldenberg

Or Gerlitz wrote:
> 2) As for IPoIB stateless offload, with Eli and Liran not planning to
> be there: Dror, do you intend to present the actual ipoib / core /
> drivers related design and implementation? Also, personally, I felt
> that the 1-2 slides you delivered at Sonoma were way below what would
> let one understand exactly which features the HW supports, and I don't
> want to be referred to under-NDA docs; let's just have you provide a
> clear description of large-send and checksum offloading. Same for the
> HW interrupt mitigation: it would be nice if you explained the problem
> and the solution, and spared a few words on how this goes with NAPI.
> One more thing is the LRO stuff - it's a pure SW optimization; if you
> think this should be in the ipoib code, some justification materials
> would be helpful.


Yes, I will try to do a better job this time :)




[ewg] RE: [ofa-general] Re: OpenFabrics Developer's Summit: feedback requested

2007-10-29 Thread Kanevsky, Arkady
Johann,
Please do not schedule anything after the Thursday 7:00pm iWARP session.
I expect it to go much longer than 1 hour.

Arkady Kanevsky                email: [EMAIL PROTECTED]
Network Appliance Inc.         phone: 781-768-5395
1601 Trapelo Rd. - Suite 16.   Fax: 781-895-1195
Waltham, MA 02451              central phone: 781-768-5300
 

> -----Original Message-----
> From: Johann George [mailto:[EMAIL PROTECTED] 
> Sent: Monday, October 29, 2007 11:08 AM
> To: [EMAIL PROTECTED]; 
> [EMAIL PROTECTED]; ewg@lists.openfabrics.org
> Subject: [ofa-general] Re: OpenFabrics Developer's Summit: 
> feedback requested
> 
> Someone proposed a fourth option worth considering which is 
> staying later on Friday.  Here are the alternatives we are 
> looking for feedback on:
> 
> (1) Are you willing and able to attend if we start at
> 11:00am on Thursday rather than at 1:00pm?
> 
> (2) If we are able to, would you prefer to see simultaneous
> tracks and lengthen some of the sessions.
> 
> (3) Would you like to see additional MPI sessions crammed
> into the allotted time?
> 
> (4) Are you willing and able to stay if we ran later on
> Friday?  How long?
> 
> Thanks.
> 
> Johann


Re: [ofa-general] Re: [ewg] to be discussed at the developer conference

2007-10-29 Thread Or Gerlitz

Sean Hefty wrote:
>> 7) The inform info code. Sean - you implemented it and attempted to
>> push it along with the SA caching push, but since the cache was
>> rejected, so was the inform info code. So the questions here are: how
>> do we make this push happen? Are there any open issues? Etc.
>
> There either needs to be an in-kernel user, or we need to reach
> agreement on the best way to expose this to userspace.  Neither this
> nor the multicast code is directly exported.


IB multicast send-only joins (SendOnlyNonMember in IB spec notation) are
the user that can enable merging the inform-info code.


Specifically, the in-kernel user I suggest is the rdma-cm: enhance the
librdmacm API to let the consumer specify that they want a "send-only"
join. For such joins, have the rdma-cm register for the "GID IN" event on
the group's MGID and, once such an event happens, do the actual join on
the group.


How does this sound?
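
A minimal sketch of the deferred join flow described above, written as a
toy Python model rather than rdma-cm or librdmacm code. The class name,
the method names and the do_join callback are all hypothetical and exist
only for illustration; the sketch assumes the event arrives keyed by the
group's MGID.

```python
from collections import defaultdict


class DeferredSendOnlyJoiner:
    """Toy model: defer a send-only join until a "GID IN" event arrives.

    Every name here is hypothetical; nothing maps to a real rdma-cm call.
    """

    def __init__(self, do_join):
        self._do_join = do_join            # assumed callback doing the real join
        self._pending = defaultdict(list)  # mgid -> consumers waiting on it
        self.joined = set()                # MGIDs that were actually joined

    def request_send_only_join(self, mgid, consumer):
        """Return True if the group is already joined, False if deferred."""
        if mgid in self.joined:
            return True
        self._pending[mgid].append(consumer)
        return False

    def on_gid_in(self, mgid):
        """Handle a "GID IN" event; return the consumers that were waiting."""
        waiters = self._pending.pop(mgid, [])
        if not waiters:
            return []                      # nobody asked for this group
        self._do_join(mgid)                # one real join serves all waiters
        self.joined.add(mgid)
        return waiters
```

The point of the design is that a send-only consumer never triggers a
join for a group that does not exist yet; the join happens only once the
event reports the group.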

> I have seen e-mails on the list that event subscription is used by
> userspace apps, but it is done via the MAD layer directly.


Besides having each such app reinvent the wheel in its low-level
inform-info code, this is bad since there is no reference counting: one
process unregistering makes a second process stop getting events (unless
they also implemented a reference-counting daemon...). Anyway, I think we
want your implementation in, and the question is how we do that.


Or.



[ewg] Re: [ofa-general] to be discussed at the developer conference

2007-10-29 Thread Or Gerlitz

Dror Goldenberg wrote:
>> 2) As for IPoIB stateless offload, with Eli and Liran not planning to
>> be there: Dror, do you intend to present the actual ipoib / core /
>> drivers related design and implementation? Also, personally, I felt
>> that the 1-2 slides you delivered at Sonoma were way below what would
>> let one understand exactly which features the HW supports, and I don't
>> want to be referred to under-NDA docs; let's just have you provide a
>> clear description of large-send and checksum offloading. Same for the
>> HW interrupt mitigation: it would be nice if you explained the problem
>> and the solution, and spared a few words on how this goes with NAPI.
>> One more thing is the LRO stuff - it's a pure SW optimization; if you
>> think this should be in the ipoib code, some justification materials
>> would be helpful.
>
> Yes, I will try to do a better job this time :)


Let's make it more concrete: please comment on whether you will be
presenting the actual SW design and, more importantly, how much time you
think you need. 20m is way below anything that allows for questions and
some discussion - will 45m be enough?


Will you be referring to the last patch set posted by Eli (it has some
pending comments that were not addressed), or is Eli going to post a new
version before the conference?


Or.
