On Thu, 6 Jul 2006, Benjamin Herrenschmidt wrote:
| On Thu, 2006-07-06 at 16:34 -0700, Bryan O'Sullivan wrote:
| > On Fri, 2006-07-07 at 08:37 +1000, Benjamin Herrenschmidt wrote:
| >
| > > > +int ipath_unordered_wc(void)
| > > > +{
| > > > + return 1;
| > > > +}
| > >
| > > How is the abo
On Wed, Jul 12, 2006 at 05:40:13PM -0700, David Miller wrote:
> From: Roland Dreier <[EMAIL PROTECTED]>
> Date: Wed, 12 Jul 2006 17:11:26 -0700
>
> > A cleaner solution would be to make the dma_ API really use the device
> > it's passed anyway, and allow drivers to override the standard PCI
> > st
Ravinandan Arakali wrote:
> Also, I am trying to run some of the iwarp bandwidth/latency tests
> (available under directory perftest).
> The first thing to do here is to run opensm. When I run opensm (with debug
You do not need opensm for iwarp.
You will be able to use only rdma_bw and rdma_lat fro
>I'm concerned about how rdma_cm abstracts HCAs. It looks like I can use
>the src_addr argument to rdma_resolve_addr() to select which IP
>address/HCA (assuming one IP per HCA), but how can I enumerate the
>available HCAs?
The HCA / RDMA device abstraction is there for device hotplug, but the ver
>> I don't know if this is an HCA firmware issues, switch issue, or openSM
>issue.
>> I don't think it's related to my changes or osmtest at this point.
>
>I'll see if I can reproduce this tomorrow.
>
>Also, can you send me the guid2lid files from the 3 SMs ?
I'll send this tomorrow. Before reloa
Ravinandan,
On Wed, 2006-07-12 at 19:39, Ravinandan Arakali wrote:
> Also, I am trying to run some of the iwarp bandwidth/latency tests
> (available under directory perftest).
> The first thing to do here is to run opensm.
You don't need to run OpenSM for iWARP.
> When I run opensm (with debug
On Wed, 2006-07-12 at 18:36, Sean Hefty wrote:
> Hal Rosenstock wrote:
> > With the default sminfo_polling_timeout of 10 seconds and default
> > polling_retry_number of 4, the total handoff time should be around 40
> > seconds. I just did that experiment with 2 SMs and saw that as well.
>
> Oka
On Wed, 12 Jul 2006 13:45:12 -0700
Roland Dreier <[EMAIL PROTECTED]> wrote:
> Currently, the code in lib/idr.c uses a bare spin_lock(&idp->lock) to
> do internal locking. This is a nasty trap for code that might call
> idr functions from different contexts; for example, it seems perfectly
> reaso
Hi Pat,
On 15:14 Tue 11 Jul , Patrick Mullaney wrote:
> This will avoid an invalid warning about service level value if sl=0 is
> used in the partition config file.
Yes, this is a wrong warning, but the original goal of this check was to catch
a non-numeric string. Think something like this may be be
From: Roland Dreier <[EMAIL PROTECTED]>
Date: Wed, 12 Jul 2006 17:11:26 -0700
> A cleaner solution would be to make the dma_ API really use the device
> it's passed anyway, and allow drivers to override the standard PCI
> stuff nicely. But that would be major surgery, I guess.
Clean but expensiv
> One solution is to change the IB device driver interface so that
> kernel virtual addresses are passed to the IB device driver and
> the device driver is responsible for calling dma_map_single(), etc.
> I believe this will be unacceptable to the OpenFabrics community
Actually it's worse than
Sean Hefty wrote:
> Andrew Friedley wrote:
>
>> I'm trying to understand how the ibverbs multicast API works, but I'm
>> not sure how multicast groups are created. I understand that
>> ibv_attach_mcast() and ibv_detach_mcast() are used to join/leave a
>> particular multicast group, but IB arch
From: Ralph Campbell <[EMAIL PROTECTED]>
Date: Wed, 12 Jul 2006 16:29:27 -0700
> Currently, the ib_ipath driver requires that the mapping be
> one-to-one since there is no practical way to reverse IOMMU
> mappings.
You can maintain a hash table that maps DMA addresses back to kernel
mappings. De
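
The reverse-lookup idea in the reply above (a hash table keyed by DMA address) could look roughly like the following. This is only a userspace sketch under stated assumptions, not code from ib_ipath or the kernel: the names `rev_map`, `rev_map_insert`, `rev_map_lookup` and `rev_map_remove` are hypothetical, pages are assumed to be 4 KiB, and the bucket count is arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumptions for this sketch: 4 KiB pages and a fixed bucket count. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_MASK  ((UINT64_C(1) << SKETCH_PAGE_SHIFT) - 1)
#define SKETCH_BUCKETS    256

struct rev_entry {
    uint64_t dma_page;          /* page-aligned DMA (bus) address */
    void *vaddr_page;           /* virtual address of the same page */
    struct rev_entry *next;     /* chain for colliding entries */
};

struct rev_map {
    struct rev_entry *buckets[SKETCH_BUCKETS];
};

/* Hash on the page frame number so all addresses in a page share a bucket. */
static size_t rev_hash(uint64_t dma_page)
{
    uint64_t pfn = dma_page >> SKETCH_PAGE_SHIFT;
    return (size_t)((pfn ^ (pfn >> 8)) % SKETCH_BUCKETS);
}

/* Record a mapping when the page is DMA-mapped. Returns 0, or -1 on OOM. */
int rev_map_insert(struct rev_map *m, uint64_t dma_page, void *vaddr_page)
{
    struct rev_entry *e;
    size_t b;

    assert(m != NULL && (dma_page & SKETCH_PAGE_MASK) == 0);
    e = malloc(sizeof(*e));
    if (e == NULL)
        return -1;
    b = rev_hash(dma_page);
    e->dma_page = dma_page;
    e->vaddr_page = vaddr_page;
    e->next = m->buckets[b];
    m->buckets[b] = e;
    return 0;
}

/* Translate any DMA address inside a recorded page; NULL if unknown. */
void *rev_map_lookup(const struct rev_map *m, uint64_t dma_addr)
{
    uint64_t page = dma_addr & ~SKETCH_PAGE_MASK;
    const struct rev_entry *e;

    for (e = m->buckets[rev_hash(page)]; e != NULL; e = e->next)
        if (e->dma_page == page)
            return (char *)e->vaddr_page + (dma_addr & SKETCH_PAGE_MASK);
    return NULL;
}

/* Drop a mapping when the page is unmapped. Returns 0, or -1 if absent. */
int rev_map_remove(struct rev_map *m, uint64_t dma_page)
{
    struct rev_entry **pp = &m->buckets[rev_hash(dma_page)];

    for (; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->dma_page == dma_page) {
            struct rev_entry *dead = *pp;
            *pp = dead->next;
            free(dead);
            return 0;
        }
    }
    return -1;
}
```

A driver using this approach would presumably insert an entry at map time, remove it at unmap time, and protect the table with a lock; none of that is shown here.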
Also, I am trying to run some of the iwarp bandwidth/latency tests
(available under directory perftest).
The first thing to do here is to run opensm. When I run opensm (with debug
level 10), I get the following error. Any idea what needs to be done to get
this working ?
openfab2:/tmp/ib/src/usersp
I have been looking at how to eliminate the bus_to_virt() and
phys_to_virt() calls used by the ib_ipath driver.
I am looking for suggestions on how to proceed.
The current IB core to IB device driver interface relies
on a kernel module being able to call ib_get_dma_mr() to allocate
a memory region
Hi Mike,
On 7/12/06, Michael Krause <[EMAIL PROTECTED]> wrote:
>
> At 09:48 AM 7/12/2006, Jeff Broughton wrote:
>
>> Modifying the sockets API is just defining yet another RDMA API, and we have
>> so many already
>
> I disagree. This effort has distilled the API to basically one for RDMA
> de
Hal Rosenstock wrote:
> With the default sminfo_polling_timeout of 10 seconds and default
> polling_retry_number of 4, the total handoff time should be around 40
> seconds. I just did that experiment with 2 SMs and saw that as well.
Okay - I narrowed down the test case to something reproducible
At 09:48 AM 7/12/2006, Jeff Broughton wrote:
Mike,
The whole purpose of SDP is to make sockets go faster without having to have
the applications modified. This is what the customers want. I've heard this
time and time again, across a wide spectrum of customers.
I am well aware of this. Howe
On Wed, 2006-07-12 at 13:58, Sean Hefty wrote:
> >> I was starting / stopping openSM on different systems soon before running
> >> the
> >> tests.
> >
> >Not sure I quite understand the sequencing.
>
> I was being somewhat random, just trying to stress things.
> How quickly will one SM take ov
Andrew Friedley wrote:
> I'm trying to understand how the ibverbs multicast API works, but I'm
> not sure how multicast groups are created. I understand that
> ibv_attach_mcast() and ibv_detach_mcast() are used to join/leave a
> particular multicast group, but IB architecture spec indicates a g
* Roland Dreier <[EMAIL PROTECTED]> wrote:
> Currently, the code in lib/idr.c uses a bare spin_lock(&idp->lock) to
> do internal locking. This is a nasty trap for code that might call
> idr functions from different contexts; for example, it seems perfectly
> reasonable to call idr_get_new() f
I'm trying to understand how the ibverbs multicast API works, but I'm
not sure how multicast groups are created. I understand that
ibv_attach_mcast() and ibv_detach_mcast() are used to join/leave a
particular multicast group, but IB architecture spec indicates a group
must be created first. H
John Partridge wrote:
>
>I installed the dapl rpm. I do have libdat.so.1 but I also expect a
>symlink to libdat.so which does not exist (Intel MPI appears to need it)
>
>I also noticed that the dat.conf points to
>/usr/local/ofed/lib/libdaplcma.so but there is no symlink in the
>/usr/local/ofed/li
Michael S. Tsirkin wrote:
>Quoting r. Arlin Davis <[EMAIL PROTECTED]>:
>
>
>>The latest uDAPL from the trunk and uCMA set option support is sufficient.
>>
>>
>
>Which options do you set? Retry/timeout or path as well?
>
>
>
Just retry/timeout.
Quoting r. Arlin Davis <[EMAIL PROTECTED]>:
> The latest uDAPL from the trunk and uCMA set option support is sufficient.
Which options do you set? Retry/timeout or path as well?
--
MST
___
openib-general mailing list
openib-general@openib.org
http://o
Currently, the code in lib/idr.c uses a bare spin_lock(&idp->lock) to
do internal locking. This is a nasty trap for code that might call
idr functions from different contexts; for example, it seems perfectly
reasonable to call idr_get_new() from process context and idr_remove()
from interrupt cont
Tziporet Koren wrote:
>
> • Core:
>
> – Set options in CMA & uCMA (needed for Intel MPI)
>
> – HCA fatal - full flow support
>
> – Huge pages support
>
>
> • uDAPL:
>
> – Scalability features needed for Intel MPI – take from trunk
>
> • Arlin & James – please reply if there are more features neede
OpenSM/SA: Minor reordering of SA rcv_process functions to be more
consistent
Also, some cosmetic changes
Signed-off-by: Hal Rosenstock <[EMAIL PROTECTED]>
Index: opensm/osm_sa_guidinfo_record.c
===
--- opensm/osm_sa_guidinfo_record.
According to the IB spec, either the Verbs consumer or the CI can initiate
path migration, but the spec doesn't describe the policy for who should
initiate it and doesn't clearly define who should change the state in
each case.
If the hca (mthca0 - MT25208) supports automatic path migration when
> I see now that the link pointed by drivers/infiniband/ulp/ipoib/Kconfig
> and Documentation/infiniband/ipoib is broken, i can find many copies of
> it eg http://mirror.switch.ch/ftp/doc/ietf/ipoib/ipoib-charter.txt but
> the original one http://www.ietf.org/html.charters/ipoib-charter.html
To me this schedule seems too short to expect real new features like
> - HCA fatal - full flow support
> * IPoIB
> - Bonding - for high availability
that have had no work done (in public at least) yet to be integrated.
If 1.1 is going to go to code freeze in 19 days t
Tziporet,
> - Based on 0.97 (we will not move to 0.98 since we tested it and
> found it is less stable than 0.97)
Could you please indicate which version of 0.9.8 you tested and what
are the exact problems you have faced.
Please note that 0.9.8 has not been formally released yet. What is
> i agree that the IDR subsystem should be irq-safe if GFP_ATOMIC is
> passed in. So the _irqsave()/_irqrestore() fix should be done.
OK, I will send the idr change to Andrew.
> But i also think that you should avoid using GFP_ATOMIC for any sort of
> reliable IO path and push as much work
> this does not have to be a false positive!
> It is not legal to take ANY non-hardirq safe lock after having taken a
> lock that's used in hardirq context.
> (having said that the skb_queue_tail lock needs a special treatment for
> some real false positives; Linus merged that already)
...
> Avoid bogus out of memory errors: fix sa_query to actually pass gfp_mask
> supplied by the user to idr_pre_get.
Yes, this looks right to me.
- R.
>> I was starting / stopping openSM on different systems soon before running the
>> tests.
>
>Not sure I quite understand the sequencing.
I was being somewhat random, just trying to stress things. How quickly will one
SM take over for another after one dies?
>Can you run with -V and send me the
I haven't yet tried with a fresh installation. But I did notice that,
compared to my current tree, several files under libamso and couple of files
under librdma have been removed by Steve Wise. I did the same change to my
tree and rebuilt librdmacm.so and amso.so but still see the same crash.
Ravi
On Wed, 2006-07-12 at 12:41, Sean Hefty wrote:
> Hal Rosenstock wrote:
> >>>and running multiple copies of opensm on different systems.
> >>
> >>Not sure what that would fail. The other SMs should be standbys. I can't
> >>think of what would fail in osmtest off the top of my head but haven't
> >>tr
Ravinandan,
Do you still see the rping crash?
Thanks,
Pradipta Kumar.
Ravinandan Arakali wrote:
> Pradipta,
> Okay, thanks.. Initially, I was not sure since I don't remember non-zero
> values in /proc/krping. When I re-ran the krping test, I see following
> output
> openfab2:~ # cat /proc/krpin
Mike,
The whole purpose of SDP is to make
sockets go faster without having to have the applications modified. This
is what the customers want. I've heard this time and time again, across a
wide spectrum of customers.
Modifying the sockets API is just defining yet another
RDMA API, and
Hal Rosenstock wrote:
>>>and running multiple copies of opensm on different systems.
>>
>>Not sure what that would fail. The other SMs should be standbys. I can't
>>think of what would fail in osmtest off the top of my head but haven't
>>tried this yet but am now about to.
I was starting / stoppin
This is the latest svn ehca code, 2.6.17 kernel.
Can I also request that the EHCA driver print out what PHYP firmware it
is known to work with, just like mthca prints out a warning if the
mellanox card firmware is out of date? And while I'm asking about PHYP,
what version are the ehca developer
James Lentini wrote:
>
> On Tue, 11 Jul 2006, John Partridge wrote:
>
>
>>The resulting build from your last patch has been installed and we
>>are in the process of DAPL tests now. I do know that the libdat
>>works with Intel MPI (although we had to manually create a symlink
>>from libdat.so.
Michael S. Tsirkin wrote:
> Yes, this is true for users that pass GFP_ATOMIC to sa_query, at least. But
> might not be so for other users: send_mad in sa_query actually gets gfp_flags
> parameter, but for some reason it does not pass it to idr_pre_get, which means
> even sa query done with GFP_KE
At 12:59 AM 7/12/2006, Tziporet Koren wrote:
Scott Weitzenkamp (sweitzen) wrote:
> For SDP, I would like to see "improved stability" (maybe you have this
> in mind under "beta quality"), also how about "AIO support"? The rest
> of the list looks good.
>
Yes - beta quality means improved stab
On Tue, 11 Jul 2006, John Partridge wrote:
> James Lentini wrote:
> > This is included on IA64 and PPC systems. Since we have not done testing on
> > IA64 or PPC, I'm certain that this was a contribution for a IA64 or PPC DAPL
> > user. For that reason, I'm not certain why the asm/system.h was i
On Tue, 11 Jul 2006, John Partridge wrote:
> The resulting build from your last patch has been installed and we
> are in the process of DAPL tests now. I do know that the libdat
> works with Intel MPI (although we had to manually create a symlink
> from libdat.so.1 to libdat.so - should this
Hi Hal,
Yea, I think it should be ULONG_MAX and your local variable sl should
also be an unsigned long. Its precision will get truncated on the
assignment to conf->sl but no problem due to the range being
limited(0-15).
Pat
>>> Hal Rosenstock <[EMAIL PROTECTED]> 07/12/06 7:08 AM >>>
Hi again Pa
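
The check discussed in the reply above (accept sl=0, still catch non-numeric strings, parse into an unsigned long, compare against ULONG_MAX, then narrow to the 0-15 range) might be sketched as follows. This is an illustration only, not the OpenSM partition parser: `parse_sl` is a hypothetical helper, and using `strtoul` with base 0 is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not the OpenSM code: parse a service level string.
 * Returns 0 and stores the value in *out on success, -1 on failure.
 */
int parse_sl(const char *str, uint8_t *out)
{
    char *end;
    unsigned long sl;

    assert(str != NULL && out != NULL);

    errno = 0;
    sl = strtoul(str, &end, 0);

    /* No digits consumed, or trailing junk: the string is not numeric. */
    if (end == str || *end != '\0')
        return -1;
    /* strtoul reports overflow as ULONG_MAX with errno set to ERANGE. */
    if (sl == ULONG_MAX && errno == ERANGE)
        return -1;
    /* Service levels are limited to 0-15, so 0 is a valid value. */
    if (sl > 15)
        return -1;

    /* Narrowing is safe here because the range was checked above. */
    *out = (uint8_t)sl;
    return 0;
}
```

The point of testing the end pointer rather than the returned value is that a result of 0 no longer doubles as the error indicator, which is what made sl=0 trigger the warning in the first place.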
On Wed, 2006-07-12 at 09:13, yipeeyipee yipeeyipee wrote:
> --- Hal Rosenstock <[EMAIL PROTECTED]> wrote:
>
> [snip]
> Should this IS_SM bit in port attributes be supported
> in the switch hardware?
If you are running an SM on your switch, the IS_SM bit would be on for
port 0. Otherwise not.
> >
Hi again Pat,
On Tue, 2006-07-11 at 17:14, Patrick Mullaney wrote:
> This will avoid an invalid warning about service level value if sl=0 is
> used in the partition config file. Can you include something like it in
> your original patch?
>
> Thanks.
> Pat
>
>
--- Hal Rosenstock <[EMAIL PROTECTED]> wrote:
[snip]
Should this IS_SM bit in port attributes be supported
in the switch hardware?
> Yes (I'm pretty sure). The user_mad API has not
> changed in quite some
> time now. What ABI version is 2.6.14 ?
I don't know where to check this.
On Tue, 2006-07-11 at 09:27, yipee wrote:
> Hal Rosenstock voltaire.com> writes:
> [snip]
> > It's not the setting which is failing. You are likely not using an SM
> > which supports this (it is an enhanced capability defined in a 1.2
> > erratum). Are you running a recent OpenSM or something els
On Wed, 2006-07-12 at 06:51, Tziporet Koren wrote:
> Hal Rosenstock wrote:
> >> • OSM:
> >>
> >> –Partition Manager (Pkey)
> >>
> >
> > Also, primitive QoS support.
> >
> >
> >> –Pre-computed routing load from file
> >>
> >
> > Also, diags:
> >
> > Add saquery
On Wed, 2006-07-12 at 07:13, Hal Rosenstock wrote:
> > and running multiple copies of opensm on different systems.
>
> Not sure what that would fail. The other SMs should be standbys. I can't
> think of what would fail in osmtest off the top of my head but haven't
> tried this yet but am now about
On Tue, 2006-07-11 at 18:55, Sean Hefty wrote:
> Other issues that I've been running into appear to be related to a
> combination
> of timing issues running the tests too quickly after starting opensm
Yes, if opensm has not gotten far enough, osmtest will fail. The SM must
initialize the subnet
On Tue, 2006-07-11 at 16:07, Sean Hefty wrote:
[snip...]
> >> After further testing, this patch breaks osmtest as a result of modifying
> >> the
> >> TID for a SEND.
> >
> >What does the test do?
>
> Hmm... I just reran the test, and it worked now. Now I'm really confused as
> to
> what the pr
Quoting r. Ingo Molnar <[EMAIL PROTECTED]>:
> But i also think that you should avoid using GFP_ATOMIC for any sort of
> reliable IO path and push as much work into process context as possible.
> Is it acceptable for your infiniband IO model to fail with -ENOMEM if
> GFP_ATOMIC happens to fail, a
Hal Rosenstock wrote:
>> • OSM:
>>
>> –Partition Manager (Pkey)
>>
>
> Also, primitive QoS support.
>
>
>> –Pre-computed routing load from file
>>
>
> Also, diags:
>
> Add saquery tool
>
> Enhancement to ibnetdiscover tool with grouping function
>
OK - I wi
Hi Or,
On Wed, 2006-07-12 at 03:13, Or Gerlitz wrote:
> Hi Hal,
>
> I think you were involved in setting/updating the pointers from the
> IPoIB kernel docs to the IETF website...
> I see now that the link pointed by drivers/infiniband/ulp/ipoib/Kconfig
> and Documentation/infiniband/ipoib is bro
On Wed, 2006-07-12 at 01:53, Tziporet Koren wrote:
> Hi All,
>
>
>
> I wish to start the release process of OFED 1.1.
>
> I would like that we will have a meeting next Monday to review this
> proposal of the release features and schedule.
>
> If possible I wish to move the meeting hour from 9
Hi Pat,
On Tue, 2006-07-11 at 17:14, Patrick Mullaney wrote:
> This will avoid an invalid warning about service level value if sl=0 is
> used in the partition config file. Can you include something like it in
> your original patch?
Yes, SL 0 is valid so this warning should not be output. I will i
* Roland Dreier <[EMAIL PROTECTED]> wrote:
> Hmm, good point.
>
> It sort of seems to me like the idr interfaces are broken by design.
[...]
> So, ugh... maybe the best thing to do is change lib/idr.c to use
> spin_lock_irqsave() internally?
i agree that the IDR subsystem should be irq-safe i
> The problem with that is then there are two libraries to maintain,
> fixes have to be merged twice, etc. It's much better to follow an
> evolutionary path.
Thanks for the feedback. OK, I will make the changes and re-submit.
Quoting r. Pradipta Kumar Banerjee <[EMAIL PROTECTED]>:
> Subject: [PATCH 0/2] perftest: enhancement to rdma_bw to allow use of RDMA CM
>
> This patchset allows rdma_bw to use RDMA CM. This patch tries to address the
> comments from Michael Tsirkin on the earlier posted patch by Steve Wise.
> See
Scott Weitzenkamp (sweitzen) wrote:
> For SDP, I would like to see "improved stability" (maybe you have this
> in mind under "beta quality"), also how about "AIO support"? The rest
> of the list looks good.
>
Yes - beta quality means improved stability.
AIO is not planned for 1.1 (schedule issu
Or Gerlitz wrote:
>> • IPoIB
>> – Bonding - for high availability
>
> Can you point me to the person/company which is working on this? I've
> started to look on it as well so we can exchange ideas and join forces.
Vlad and Eitan from Mellanox are working on this
>
>
>> •
For SDP, I would like to see "improved stability" (maybe
you have this in mind under "beta quality"), also how about "AIO support"?
The rest of the list looks good.
Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
From: [EMAIL PROTECT
Tziporet Koren wrote:
> I wish to start the release process of OFED 1.1.
> • IPoIB
> – Bonding - for high availability
Can you point me to the person/company which is working on this? I've
started to look on it as well so we can exchange ideas and join forces.
> • iSER
Hi Hal,
I think you were involved in setting/updating the pointers from the
IPoIB kernel docs to the IETF website...
I see now that the link pointed by drivers/infiniband/ulp/ipoib/Kconfig
and Documentation/infiniband/ipoib is broken, i can find many copies of
it eg http://mirror.switch.ch/ftp/