RE: [ofw] Re: [PATCH] osmtest - Add OSM_CDECL to main() declaration

2009-10-14 Thread Sean Hefty
>> Sean maintains a separate set of patches applied to IB diags in order to >address issues like the x86 requirement for __cdecl on main(). >> Since OSM_CDECL was already defined in OFED opensm, it did not seem to be a >major concession to prefix main() with it. >> Options: >> 1) utilize OSM_CDECL

[ANNOUNCE] libnes-0.9.0 release

2009-10-14 Thread Tung, Chien Tin
New release of libnes library (0.9.0) is available at: http://www.openfabrics.org/downloads/nes/ sha1sum: 6e9374ea9ace5e052c00aa868eea6793839d1ae8 libnes-0.9.0.tar.gz Changes since last release: Chien Tung (3): libnes: fix warning in src/nes_uverbs.c:756 libnes: add support for sq_

Re: [ofa-general][PATCH 3/4] SRP fail-over faster

2009-10-14 Thread David Dillow
On Wed, 2009-10-14 at 15:47 -0700, Roland Dreier wrote: > > > > First it does not make sense for user to set it below 60; therefore, > > > > it is forced to have 60 and above > > > > Why not? A minute seems to be a really long time given the point of > > > these patches is supposed to be fai

[PATCH] opensm/opensm/osm_subnet.c: adjust buffer to ensure a '\n' is printed

2009-10-14 Thread Ira Weiny
From: Ira Weiny Date: Wed, 14 Oct 2009 17:05:53 -0700 Subject: [PATCH] opensm/opensm/osm_subnet.c: adjust buffer to ensure a '\n' is printed When printing cached options strings which fill the print buffer, adjust the length so the final snprintf(..., "\n"); can succeed. Signed

Re: [ofa-general][PATCH 3/4] SRP fail-over faster

2009-10-14 Thread Vu Pham
Roland Dreier wrote: > > > First it does not make sense for user to set it below 60; therefore, > > > it is forced to have 60 and above > > Why not? A minute seems to be a really long time given the point of > > these patches is supposed to be failing over faster. Surely we can tell > >

Re: [ofa-general][PATCH 3/4] SRP fail-over faster

2009-10-14 Thread Roland Dreier
> > > First it does not make sense for user to set it below 60; therefore, > > > it is forced to have 60 and above > > Why not? A minute seems to be a really long time given the point of > > these patches is supposed to be failing over faster. Surely we can tell > > if a path really fail

[PATCHv3] opensm: Reduce heap consumption by multicast routing tables (MFTs)

2009-10-14 Thread Hal Rosenstock
Heap memory consumption by the unicast and multicast routing tables can be reduced. This patch is analogous to the previous patch doing this for the unicast routing tables (LFTs). Using valgrind --tool=massif (for heap profiling), there are a couple of places ->38.75% (11,206,656B) 0x43267E: osm_sw

Re: [ofa-general][PATCH 4/4] SRP fail-over faster

2009-10-14 Thread Vu Pham
Bart Van Assche wrote: + + switch (event->event) { + case IB_EVENT_PORT_ERR: + list_for_each_entry_safe(host, tmp_host, +&srp_dev->dev_list, list) { + if (event->element.port_num == host->port) { +

Re: [ofa-general][PATCH 3/4] SRP fail-over faster

2009-10-14 Thread Vu Pham
Roland Dreier wrote: > First it does not make sense for user to set it below 60; therefore, > it is forced to have 60 and above Why not? A minute seems to be a really long time given the point of these patches is supposed to be failing over faster. Surely we can tell if a path really failed

Re: [ofa-general][PATCH 2/4] SRP fail-over faster

2009-10-14 Thread Vu Pham
Roland Dreier wrote: > > > - wait_for_completion(&target->done); > > > > How do you avoid leaking connection on module unload etc? Don't we have > > to wait for the disconnect to finish somewhere? > > > > - R. > > > Are you talking about cm_id? > I think that we wait because

Re: [ofa-general][PATCH 3/4] SRP fail-over faster

2009-10-14 Thread Vu Pham
Roland Dreier wrote: > +static int srp_dev_loss_tmo = 60; I don't think the name needs to be this abbreviated. We don't necessarily need the srp_ prefix, but probably "device_loss_timeout" is much clearer without being too much longer. OK > + > +module_param(srp_dev_loss_tmo, int, 0444

Re: [ofa-general][PATCH 3/4] SRP fail-over faster

2009-10-14 Thread Roland Dreier
> First it does not make sense for user to set it below 60; therefore, > it is forced to have 60 and above Why not? A minute seems to be a really long time given the point of these patches is supposed to be failing over faster. Surely we can tell if a path really failed sooner than 60 seconds

Re: [PATCH] opensm/osm_mcast_mgr.c: Cosmetic changes

2009-10-14 Thread Sasha Khapyorsky
On 16:01 Wed 14 Oct , Hal Rosenstock wrote: > > Signed-off-by: Hal Rosenstock Applied. Thanks. Sasha -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.h

Re: [ofa-general][PATCH 2/4] SRP fail-over faster

2009-10-14 Thread Roland Dreier
> > > - wait_for_completion(&target->done); > > > > How do you avoid leaking connection on module unload etc? Don't we have > > to wait for the disconnect to finish somewhere? > > > > - R. > > > Are you talking about cm_id? > I think that we wait because we want to reuse cq/qp

Re: [ofa-general][PATCH 2/4] SRP fail-over faster

2009-10-14 Thread Vu Pham
Roland Dreier wrote: > - wait_for_completion(&target->done); How do you avoid leaking connection on module unload etc? Don't we have to wait for the disconnect to finish somewhere? - R. Are you talking about cm_id? I think that we wait because we want to reuse cq/qp associate with the c

Re: [PATCH] link-local address fix for rdma_resolve_addr

2009-10-14 Thread Jason Gunthorpe
On Wed, Oct 14, 2009 at 12:33:17PM -0700, David J. Wilder wrote: > > On Wed, 2009-10-14 at 11:01 -0600, Jason Gunthorpe wrote: > > On Wed, Oct 14, 2009 at 09:23:57AM -0700, David J. Wilder wrote: > > > > > This new patch should closely emulate tcp_ipv6.c. when both source and > > > destination sc

[PATCH] opensm/osm_mcast_mgr.c: Cosmetic changes

2009-10-14 Thread Hal Rosenstock
Signed-off-by: Hal Rosenstock --- diff --git a/opensm/opensm/osm_mcast_mgr.c b/opensm/opensm/osm_mcast_mgr.c index 77e0b94..0ee689c 100644 --- a/opensm/opensm/osm_mcast_mgr.c +++ b/opensm/opensm/osm_mcast_mgr.c @@ -351,7 +351,7 @@ static int mcast_mgr_set_mft_block(osm_sm_t * sm, IN osm_switch_t

Re: [PATCH] opensm/osm_subnet.h: Add mgrp_mgid_tbl description

2009-10-14 Thread Sasha Khapyorsky
On 15:51 Wed 14 Oct , Hal Rosenstock wrote: > > Signed-off-by: Hal Rosenstock Applied. Thanks. Sasha

Re: [ofa-general][PATCH 1/4] SRP fail-over faster

2009-10-14 Thread Vu Pham
Roland Dreier wrote: One meta comment: when sending a series of 4 patches, please choose a descriptive subject for each one. We don't want the same headline in the kernel log for all 4 patches. I'll remember next time > - qp_attr.qp_state = IB_QPS_RESET; > - ret = ib_modify_qp(target->qp

Re: [PATCHv2] opensm: Reduce heap consumption by multicast routing tables (MFTs)

2009-10-14 Thread Sasha Khapyorsky
On 15:46 Wed 14 Oct , Hal Rosenstock wrote: > > Seems to me that the equivalent (to LFT) is to invoke the table > (re)allocation from osm_mcast_mgr_process/process_mgroups. Is that > what you mean ? Yes, exactly. Sasha

[PATCH] opensm/osm_subnet.h: Add mgrp_mgid_tbl description

2009-10-14 Thread Hal Rosenstock
Signed-off-by: Hal Rosenstock --- diff --git a/opensm/include/opensm/osm_subnet.h b/opensm/include/opensm/osm_subnet.h index 9488225..b63c97e 100644 --- a/opensm/include/opensm/osm_subnet.h +++ b/opensm/include/opensm/osm_subnet.h @@ -634,6 +634,10 @@ typedef struct osm_subn { * Th

Re: [PATCHv2] opensm: Reduce heap consumption by multicast routing tables (MFTs)

2009-10-14 Thread Hal Rosenstock
On Wed, Oct 14, 2009 at 12:25 PM, Sasha Khapyorsky wrote: > On 11:52 Wed 14 Oct , Hal Rosenstock wrote: >> >> Heap memory consumption by the unicast and multicast routing tables can be >> reduced. >> >> This patch is analogous to the previous patch doing this for the unicast >> routing tables

Re: [PATCH] link-local address fix for rdma_resolve_addr

2009-10-14 Thread David J. Wilder
On Wed, 2009-10-14 at 11:01 -0600, Jason Gunthorpe wrote: > On Wed, Oct 14, 2009 at 09:23:57AM -0700, David J. Wilder wrote: > > > This new patch should closely emulate tcp_ipv6.c. when both source and > > destination scope_ids are given with link-local address. > > Maybe like this: > >

Re: [ofa-general][PATCH 3/4] SRP fail-over faster

2009-10-14 Thread Roland Dreier
> +static int srp_dev_loss_tmo = 60; I don't think the name needs to be this abbreviated. We don't necessarily need the srp_ prefix, but probably "device_loss_timeout" is much clearer without being too much longer. > + > +module_param(srp_dev_loss_tmo, int, 0444); > +MODULE_PARM_DESC(srp_de

Re: [ofa-general][PATCH 2/4] SRP fail-over faster

2009-10-14 Thread Roland Dreier
> -wait_for_completion(&target->done); How do you avoid leaking connection on module unload etc? Don't we have to wait for the disconnect to finish somewhere? - R. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.or

Re: [ofa-general][PATCH 1/4] SRP fail-over faster

2009-10-14 Thread Roland Dreier
One meta comment: when sending a series of 4 patches, please choose a descriptive subject for each one. We don't want the same headline in the kernel log for all 4 patches. > -qp_attr.qp_state = IB_QPS_RESET; > -ret = ib_modify_qp(target->qp, &qp_attr, IB_QP_STATE); > -if (ret) >

Re: [PATCHv2] mlx4: Add a new supported 40 GigE device ID

2009-10-14 Thread Roland Dreier
Thanks, applied. > Resending the patch based on Roland's "for-linus" branch. By the way, just to be clear, my master branch is just the state of Linus's tree at the time I branched. So it's not a particularly good thing to base work on. My "for-next" branch is probably the best if you are worr

Re: [PATCH] link-local address fix for rdma_resolve_addr

2009-10-14 Thread Jason Gunthorpe
On Wed, Oct 14, 2009 at 10:30:05AM -0700, David J. Wilder wrote: > This looks good. One concern, it may be obtuse, but if both the src and > dst are link-local addresses should only one need to be scoped? This > patch will require the src to always be scoped when using link local. The TCPv6

Re: [PATCH] link-local address fix for rdma_resolve_addr

2009-10-14 Thread David J. Wilder
On Wed, 2009-10-14 at 11:01 -0600, Jason Gunthorpe wrote: > On Wed, Oct 14, 2009 at 09:23:57AM -0700, David J. Wilder wrote: > > > This new patch should closely emulate tcp_ipv6.c. when both source and > > destination scope_ids are given with link-local address. > > Maybe like this: > >

Re: [PATCH] link-local address fix for rdma_resolve_addr

2009-10-14 Thread Jason Gunthorpe
On Wed, Oct 14, 2009 at 09:23:57AM -0700, David J. Wilder wrote: > This new patch should closely emulate tcp_ipv6.c. when both source and > destination scope_ids are given with link-local address. Maybe like this: fl.oif = 0; if (ipv6_addr_type(&src_in->sin6_addr) & IPV6_ADDR_L

Re: [PATCH] link-local address fix for rdma_resolve_addr

2009-10-14 Thread David J. Wilder
Sean- This patch should fix the behavior of rdma_resolve_addr when using link-local addressing. On Tue, 2009-10-13 at 17:12 -0600, Jason Gunthorpe wrote: > On Tue, Oct 13, 2009 at 03:09:40PM -0700, David J. Wilder wrote: > > Here is a patch to addr6_resolve_remote() to correctly handle link-loc

Re: [PATCHv2] opensm: Reduce heap consumption by multicast routing tables (MFTs)

2009-10-14 Thread Sasha Khapyorsky
On 11:52 Wed 14 Oct , Hal Rosenstock wrote: > > Heap memory consumption by the unicast and multicast routing tables can be > reduced. > > This patch is analogous to the previous patch doing this for the unicast > routing tables (LFTs). > > Using valgrind --tool=massif (for heap profiling), t

[PATCHv2] opensm: Reduce heap consumption by multicast routing tables (MFTs)

2009-10-14 Thread Hal Rosenstock
Heap memory consumption by the unicast and multicast routing tables can be reduced. This patch is analogous to the previous patch doing this for the unicast routing tables (LFTs). Using valgrind --tool=massif (for heap profiling), there are a couple of places ->38.75% (11,206,656B) 0x43267E: osm_sw

Re: [PATCHv2] opensm/osm_ucast_updn.c: Reduce temporary allocation of cas_per_sw

2009-10-14 Thread Sasha Khapyorsky
On 10:45 Wed 14 Oct , Hal Rosenstock wrote: > > Signed-off-by: Hal Rosenstock Applied. Thanks. Sasha

[PATCHv2] opensm/osm_ucast_updn.c: Reduce temporary allocation of cas_per_sw

2009-10-14 Thread Hal Rosenstock
Signed-off-by: Hal Rosenstock --- Changes since v1: Removed change of malloc/memset into calloc diff --git a/opensm/opensm/osm_ucast_updn.c b/opensm/opensm/osm_ucast_updn.c index bb9ccda..ced076a 100644 --- a/opensm/opensm/osm_ucast_updn.c +++ b/opensm/opensm/osm_ucast_updn.c @@ -2,6 +2,7 @@ *

Re: [PATCH] opensm/osm_ucast_updn.c: Reduce temporary allocation of cas_per_sw

2009-10-14 Thread Sasha Khapyorsky
On 09:33 Wed 14 Oct , Hal Rosenstock wrote: > On Wed, Oct 14, 2009 at 9:29 AM, Sasha Khapyorsky wrote: > > On 09:14 Wed 14 Oct     , Hal Rosenstock wrote: > >> > >> Also, combine malloc/memset into calloc > > > > Any special reason for malloc/calloc change? > > It's a little more concise and

Re: [PATCH] opensm/osm_ucast_updn.c: Reduce temporary allocation of cas_per_sw

2009-10-14 Thread Hal Rosenstock
On Wed, Oct 14, 2009 at 10:34 AM, Sasha Khapyorsky wrote: > On 09:33 Wed 14 Oct     , Hal Rosenstock wrote: >> On Wed, Oct 14, 2009 at 9:29 AM, Sasha Khapyorsky >> wrote: >> > On 09:14 Wed 14 Oct     , Hal Rosenstock wrote: >> >> >> >> Also, combine malloc/memset into calloc >> > >> > Any specia

Re: ConnectX saquery question

2009-10-14 Thread Yevgeny Kliteynik
Yevgeny Kliteynik wrote: Hal Rosenstock wrote: On Wed, Oct 14, 2009 at 9:30 AM, Aaron Knister wrote: I searched the lists and can't really find an answer to my question- When I run saquery -s on my cluster with ConnectX HCAs (fw ver 2.6.648) all I get is the following output IsSM ports IsSM

Re: ConnectX saquery question

2009-10-14 Thread Yevgeny Kliteynik
Hal Rosenstock wrote: On Wed, Oct 14, 2009 at 9:30 AM, Aaron Knister wrote: I searched the lists and can't really find an answer to my question- When I run saquery -s on my cluster with ConnectX HCAs (fw ver 2.6.648) all I get is the following output IsSM ports IsSMdisabled ports I thought

Re: ConnectX saquery question

2009-10-14 Thread Hal Rosenstock
On Wed, Oct 14, 2009 at 9:30 AM, Aaron Knister wrote: > I searched the lists and can't really find an answer to my question- > > When I run saquery -s on my cluster with ConnectX HCAs (fw ver > 2.6.648) all I get is the following output > > IsSM ports > > IsSMdisabled ports > > I thought this shou

Re: [PATCH] opensm/osm_ucast_updn.c: Reduce temporary allocation of cas_per_sw

2009-10-14 Thread Hal Rosenstock
On Wed, Oct 14, 2009 at 9:29 AM, Sasha Khapyorsky wrote: > On 09:14 Wed 14 Oct     , Hal Rosenstock wrote: >> >> Also, combine malloc/memset into calloc > > Any special reason for malloc/calloc change? It's a little more concise and efficient to calloc rather than malloc/memset. Any reason not to

ConnectX saquery question

2009-10-14 Thread Aaron Knister
I searched the lists and can't really find an answer to my question- When I run saquery -s on my cluster with ConnectX HCAs (fw ver 2.6.648) all I get is the following output IsSM ports IsSMdisabled ports I thought this should show ports that are running a subnet manager (there are 2 running on

Re: [PATCH] opensm/osm_ucast_updn.c: Reduce temporary allocation of cas_per_sw

2009-10-14 Thread Sasha Khapyorsky
On 09:14 Wed 14 Oct , Hal Rosenstock wrote: > > Also, combine malloc/memset into calloc Any special reason for malloc/calloc change? Please don't mix two things in one patch. Sasha > > Signed-off-by: Hal Rosenstock > --- > diff --git a/opensm/opensm/osm_ucast_updn.c b/opensm/opensm/osm_

[PATCH] opensm/osm_ucast_updn.c: Reduce temporary allocation of cas_per_sw

2009-10-14 Thread Hal Rosenstock
Also, combine malloc/memset into calloc Signed-off-by: Hal Rosenstock --- diff --git a/opensm/opensm/osm_ucast_updn.c b/opensm/opensm/osm_ucast_updn.c index bb9ccda..a2acdbd 100644 --- a/opensm/opensm/osm_ucast_updn.c +++ b/opensm/opensm/osm_ucast_updn.c @@ -2,6 +2,7 @@ * Copyright (c) 2004-2

Re: OpenSM Failover

2009-10-14 Thread Aaron Knister
It will be difficult to get you those logs until the older cluster is decommissioned (which in theory should be soon), but as soon as I am able I will get them to you. On Wed, Oct 14, 2009 at 4:46 AM, Yevgeny Kliteynik wrote: > Aaron, > > Aaron Knister wrote: As I said, the older opens

Re: [PATCH] opensm: Reduce heap consumption by multicast routing tables (MFTs)

2009-10-14 Thread Sasha Khapyorsky
Hi Hal, On 07:14 Wed 14 Oct , Hal Rosenstock wrote: > > Heap memory consumption by the unicast and multicast routing tables can be > reduced. > > This patch is analogous to the previous patch doing this for the unicast > routing tables (LFTs). > > Using valgrind --tool=massif (for heap prof

Re: [PATCH] opensm/release notes: Fix typo

2009-10-14 Thread Sasha Khapyorsky
On 07:28 Wed 14 Oct , Hal Rosenstock wrote: > > Signed-off-by: Hal Rosenstock Applied. Thanks. Sasha

Re: [Fwd: FW: OFED installation question re-AltLinux]

2009-10-14 Thread Vladimir Sokolovsky
Sergey Zhumatiy wrote: Hi Sergey, Try to add '--without-depcheck' to install.pl command: ./install.pl -c ofed.conf --without-depcheck In this case it fails during the uninstallation procedure. It seems that some old dependency was forgotten or something like this. If this script can simply print in

[PATCH] opensm/release notes: Fix typo

2009-10-14 Thread Hal Rosenstock
Signed-off-by: Hal Rosenstock --- diff --git a/opensm/doc/opensm_release_notes-3.2.txt b/opensm/doc/opensm_release_notes-3.2.txt index 3356e95..5a83092 100644 --- a/opensm/doc/opensm_release_notes-3.2.txt +++ b/opensm/doc/opensm_release_notes-3.2.txt @@ -271,7 +271,7 @@ information regarding eac

Re: switching the active interface for bonding

2009-10-14 Thread Or Gerlitz
Sumeet Lahorani wrote: We are using OFED 1.4.2 Please note that the bonding driver provided by the latest distros supports IPoIB. So if your distro happens to be RHEL 5.4 (or its OEL 5.4 derivative), or SLES11 you can and should use the distro provided bonding. Moving forward, OTOH customers wo

Re: switching the active interface for bonding

2009-10-14 Thread Or Gerlitz
Sumeet Lahorani wrote: We are [...] trying to simulate the effect of a bonding failover initiated by a switch failure using echo commands in parallel to the /sys/class/net/bond0/bonding/active_slave file on a few of the nodes attached to the switch. Is this an acceptable technique? yes We are

[PATCH] opensm: Reduce heap consumption by multicast routing tables (MFTs)

2009-10-14 Thread Hal Rosenstock
Heap memory consumption by the unicast and multicast routing tables can be reduced. This patch is analogous to the previous patch doing this for the unicast routing tables (LFTs). Using valgrind --tool=massif (for heap profiling), there are a couple of places ->38.75% (11,206,656B) 0x43267E: osm_sw

Re: [Fwd: FW: OFED installation question re-AltLinux]

2009-10-14 Thread Sergey Zhumatiy
> Hi Sergey, > Try to add '--without-depcheck' to install.pl command: > > ./install.pl -c ofed.conf --without-depcheck > In this case it fails on unistallation procedure. It seems, that some old dependency was forgotten or something like this. If this script can simply print in one line all pac

Re: [Fwd: FW: OFED installation question re-AltLinux]

2009-10-14 Thread Sergey Zhumatiy
> OFED debian packages (up to OFED 1.4.2) are available here: > > http://pkg-ofed.alioth.debian.org/ > Ok! Thank you very much! I'll try them. But main question is still open: how to install OFED on AltLinux or other distributive (not Fedora/Suse). Older installer simply did compilation on

Re: OpenSM Failover

2009-10-14 Thread Yevgeny Kliteynik
Aaron, Aaron Knister wrote: As I said, the older opensms on the older Mellanox model HCAs fail over and fail back instantly. The instant failback is expected, and this is the bug that we're discussing. As for the instant failover - I'll check how things are supposed to work and get back to you.

Re: [Fwd: FW: OFED installation question re-AltLinux]

2009-10-14 Thread Vladimir Sokolovsky
-Original Message- From: Jeffrey Scott [mailto:j...@splitrockpr.com] Sent: Tuesday, October 13, 2009 9:09 AM To: bb...@systemfabricworks.com; 'Sujal Das' Subject: FW: OFED installation Who should this inquiry go to? -Original Message- From: Sergey Zhumatiy [mailto:s...@paralle