Hi Ira,
On 12:38 Thu 21 Jan , Ira Weiny wrote:
>
> Here is some test data on a real cluster.
>
> 09:49:10 > ibhosts | wc -l
> 1158
>
> 09:49:28 > ibswitches | wc -l
> 281
>
> 09:44:45 > time ./subnet_discover -n 1 > /dev/null
>
> real 0m1.414s
> user 0m0.309s
> sys 0m0.244s
>
>
On Sat, Jan 16, 2010 at 2:36 PM, Sasha Khapyorsky wrote:
> On 15:11 Wed 13 Jan , Hal Rosenstock wrote:
>> >
>> > In my tests I found that '8' is a better value (the tool runs
>> > faster and without drops) than the '4' used in OpenSM.
>> >
>> > Of course it would be helpful to run this over
Hey Sasha,
I am finally getting back to this... Sorry.
On Wed, 13 Jan 2010 15:11:44 -0500
Hal Rosenstock wrote:
> Hi Sasha,
>
> On Tue, Jan 12, 2010 at 4:31 AM, Sasha Khapyorsky wrote:
> > Hi Hal,
> >
> > On 08:56 Mon 11 Jan , Hal Rosenstock wrote:
> >> >
> >> > diff --git a/tests/subnet
On 15:11 Wed 13 Jan , Hal Rosenstock wrote:
> >
> > In my tests I found that '8' is a better value (the tool runs
> > faster and without drops) than the '4' used in OpenSM.
> >
> > Of course it would be helpful to run this over bigger cluster than
> > what I have to see that the results are c
Hi Sasha,
On Tue, Jan 12, 2010 at 4:31 AM, Sasha Khapyorsky wrote:
> Hi Hal,
>
> On 08:56 Mon 11 Jan , Hal Rosenstock wrote:
>> >
>> > diff --git a/tests/subnet_discover.c b/tests/subnet_discover.c
>> > index 7f8a85c..42e7aee 100644
>> > --- a/tests/subnet_discover.c
>> > +++ b/tests/subnet_d
Hi Hal,
On 08:56 Mon 11 Jan , Hal Rosenstock wrote:
> >
> > diff --git a/tests/subnet_discover.c b/tests/subnet_discover.c
> > index 7f8a85c..42e7aee 100644
> > --- a/tests/subnet_discover.c
> > +++ b/tests/subnet_discover.c
> > @@ -40,6 +40,7 @@ static struct node *node_array[32 * 1024];
> >
Hi Sasha,
On Mon, Dec 28, 2009 at 4:22 AM, Sasha Khapyorsky wrote:
> Hi Ira,
>
> On 09:35 Mon 21 Dec , Sasha Khapyorsky wrote:
>>
>> The errors are response timeouts. I guess that most of them are due
>> to switches' VL15 overflow (could be verified by VL15Dropped counter
>> evaluation). Will
Hi Ira,
On 09:35 Mon 21 Dec , Sasha Khapyorsky wrote:
>
> The errors are response timeouts. I guess that most of them are due
> to switches' VL15 overflow (could be verified by VL15Dropped counter
> evaluation). Will look into this more deeply.
I made a couple of modifications to the code (exact log
On 09:02 Mon 21 Dec , Hal Rosenstock wrote:
> > I wouldn't call it that; it is "parallel" rather than depth-first or
> > breadth-first: discovery continues from whichever node responds first,
> > no matter whether it was queried in depth or in breadth.
>
> Does anything limit the amount of parallelism?
Nothi
On Mon, Dec 21, 2009 at 2:35 AM, Sasha Khapyorsky wrote:
> Hi Ira,
>
> On 18:28 Sun 20 Dec , Ira Weiny wrote:
>>
>> Yes, a similar mechanism would work in libibnetdisc. However, it looks like
>> you are doing a depth-first search
>
> I wouldn't call it that; it is rather "parallel" than "first
Hi Ira,
On 18:28 Sun 20 Dec , Ira Weiny wrote:
>
> Yes, a similar mechanism would work in libibnetdisc. However, it looks like
> you are doing a depth-first search
I wouldn't call it that; it is "parallel" rather than depth-first or
breadth-first: discovery continues at the first responding node doe
'subnet_discover' is a simple test utility that implements a "non-blocking"
discovery method in which MADs are sent "in parallel" (unlike the
current implementation of 'ibnetdiscover', and similar to what OpenSM
does). For this, the id of a recently discovered node is encoded in the lower
16 bits of mad tran
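
The snippet above is cut off, so the following is only a rough sketch of the
idea it describes, not the code from the patch. It assumes the truncated text
refers to a 64-bit MAD transaction ID, and that the node id is an index into a
table like the node_array[32 * 1024] visible in the quoted diff. The names
make_tid, tid_to_node_id, node_from_tid and the contents of struct node are
hypothetical and used for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES    (32 * 1024)  /* same size as node_array in the quoted diff */
#define NODE_ID_BITS 16           /* node id lives in the lower 16 bits */
#define NODE_ID_MASK ((UINT64_C(1) << NODE_ID_BITS) - 1)

/* Hypothetical node record; the real struct is not shown in the thread. */
struct node {
	uint64_t guid;
};

static struct node *node_array[MAX_NODES];

/*
 * Build a transaction ID from a (hypothetical) per-request sequence
 * number and a node id. The sequence number is shifted into the upper
 * bits (its top 16 bits are discarded); the node id fills the lower 16.
 */
static uint64_t make_tid(uint64_t seq, uint16_t node_id)
{
	return (seq << NODE_ID_BITS) | node_id;
}

/* Recover the node id from the transaction ID echoed in a response. */
static uint16_t tid_to_node_id(uint64_t tid)
{
	return (uint16_t)(tid & NODE_ID_MASK);
}

/*
 * Map a response back to the node that the request was sent for.
 * Returns NULL if the id is out of range or the slot is empty.
 */
static struct node *node_from_tid(uint64_t tid)
{
	uint16_t id = tid_to_node_id(tid);

	return id < MAX_NODES ? node_array[id] : NULL;
}
```

With a scheme like this the sender does not need to block on each request:
when a response arrives, in any order, node_from_tid() on the echoed
transaction ID finds the node the query belonged to, and discovery can
continue from there.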