Hi Sasha,

On Mon, Dec 28, 2009 at 4:22 AM, Sasha Khapyorsky <[email protected]> wrote:
> Hi Ira,
>
> On 09:35 Mon 21 Dec     , Sasha Khapyorsky wrote:
>>
>> The errors are response timeouts. I guess that most of them are due
>> to switches' VL15 overflow (this could be verified by checking the
>> VL15Dropped counter). Will look into this more deeply.
>
> I made a couple of modifications to the code (the exact log is listed
> below). In particular, there is now a default limit on the number of
> outstanding MADs on the wire and proper tracking of failed (timed-out)
> MADs. I tested this where possible. Could you re-run this? Thanks.
>
> Sasha
>

[snip...]

> commit da6aa19840cb2d37e8cd3daa3874b87657a76ddc
> Author: Sasha Khapyorsky <[email protected]>
> Date:   Fri Dec 25 16:24:13 2009 +0200
>
>    tests/subnet_discover: --maxsmps (-n) option
>
>    This implements the limitation of outstanding SMPs on a wire at any
>    one time. --maxsmps=0 means - no limit.
>
>    Signed-off-by: Sasha Khapyorsky <[email protected]>
>
> diff --git a/tests/subnet_discover.c b/tests/subnet_discover.c
> index 7f8a85c..42e7aee 100644
> --- a/tests/subnet_discover.c
> +++ b/tests/subnet_discover.c
> @@ -40,6 +40,7 @@ static struct node *node_array[32 * 1024];
>  static unsigned node_count = 0;
>  static unsigned trid_cnt = 0;
>  static unsigned outstanding = 0;
> +static unsigned max_outstanding = 8;

Is there any reason why this default differs from the one OpenSM
uses? It seems to me it should be the same (or less).
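
For reference, here is a minimal sketch of how I read the limit described
in the commit message. Only the outstanding / max_outstanding counters and
the "0 means no limit" rule come from the patch; the struct and function
names are hypothetical and are not taken from subnet_discover.c or OpenSM.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for SMPs on the wire (names are illustrative). */
struct smp_window {
	unsigned outstanding;     /* SMPs sent but not yet answered or timed out */
	unsigned max_outstanding; /* 0 means no limit, as with --maxsmps=0 */
};

/* Return true if one more SMP may be put on the wire right now. */
static bool smp_window_can_send(const struct smp_window *w)
{
	return w->max_outstanding == 0 || w->outstanding < w->max_outstanding;
}

/* Account for an SMP that was just sent. */
static void smp_window_sent(struct smp_window *w)
{
	assert(smp_window_can_send(w));
	w->outstanding++;
}

/* Account for a response or a final timeout; either one frees a slot. */
static void smp_window_done(struct smp_window *w)
{
	assert(w->outstanding > 0);
	w->outstanding--;
}

/* Send up to `pending` queued SMPs; return how many were actually sent. */
static unsigned smp_window_drain(struct smp_window *w, unsigned pending)
{
	unsigned sent = 0;

	while (sent < pending && smp_window_can_send(w)) {
		smp_window_sent(w);
		sent++;
	}
	return sent;
}
```

With max_outstanding = 8, a queue of 20 pending SMPs would send 8 and hold
the rest until smp_window_done() frees a slot. A smaller default shrinks
the burst a switch's VL15 buffer has to absorb, which is why the choice of
default matters here.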

-- Hal

>  static unsigned timeout = 100;
>  static unsigned retries = 3;
>  static unsigned verbose = 0;

[snip...]
