On Thu, 2005-10-13 at 18:46, Troy Benjegerdes wrote:
> I'm also attaching part of an opensm log file.
>
> (the full copy is at http://scl.ameslab.gov/~troy/osm-ehca.log )
>
> The IBM galaxy adapters are at:
> Initial path: [0][1][16]
> Initial path: [0][1][13]
>
The OpenSM is just s
Helen> Not in realtime. My observations were made after the fact.
Helen> I suppose I can launch another test and watch the counter in
Helen> realtime if you believe that is necessary?
That might be interesting.
Assuming the HCA continues to work fine, and IPoIB recovers, the only
theor
Roland Dreier wrote:
Thanks, I'll read this over.
What's the motivation here? To shift over to ib_create_send_mad() so
that all the MAD-related DMA mapping stuff is in one place, to make it
easier to fix?
Yes - the motivation is to fix the DMA mapping issue that you pointed out by
changing i
Roland,
>From [EMAIL PROTECTED] Thu Oct 13 16:19:30 2005
>
>Helen> BTW, the state of the IPoIB network seemed fine after the
>Helen> failed test, and the mthca counters are moving up nicely.
>
>Even on the server on3-ib?
Yes, even on the server on3-ib.
>
>Helen> Do you still think thi
Thanks, I'll read this over.
What's the motivation here? To shift over to ib_create_send_mad() so
that all the MAD-related DMA mapping stuff is in one place, to make it
easier to fix?
- R.
___
openib-general mailing list
openib-general@openib.org
http
This patch changes sa_query to allocate MADs using the ib_create_send_mad()
routine.
The intent behind this change was to eventually change ib_post_send_mad() to
take an ib_send_mad_buf as input, but see the "DMA mapping abuses in MAD layer"
thread. We may want to go with an alternate solution.
Sean> Any preference to pursuing this change or modifying
Sean> ib_post_send_mad to take an ib_mad_send_buf?
I think it's going to be confusing to cast a virtual address to a long
and then ignore the lkey field. So I would go with a new interface
not built on ib_sge.
On the other hand, m
Helen> BTW, the state of the IPoIB network seemed fine after the
Helen> failed test, and the mthca counters are moving up nicely.
Even on the server on3-ib?
Helen> Do you still think this is a crash of the HCA firmware?
Helen> Should I call Mellanox?
Not if IPoIB is working on the
Thanks. It's strange the copy-paste
gave an extra 1.
Shirley Ma
IBM Linux Technology Center
15300 SW Koll Parkway
Beaverton, OR 97006-6063
Phone(Fax): (503) 578-7638
Sean Hefty wrote:
Does anyone else have any other ideas on how to fix this issue?
The current MAD interface requires the user to have code similar to this:
send_buf->sge.addr = dma_map_single(mad_agent->device->dma_device,
buf, buf_size, DMA_
Roland,
Ci
So you are right, it is not a moving target. After repeating
the IOZONE tests several times, I narrowed down the culprit,
server on3-ib. Parallel I/O had made it a bit difficult to
chase it down :-(
BTW, the state of the IPoIB network seemed fine after the failed
test, and the mth
Roland> My plan is to change the receive handling of IPoIB
Roland> slightly, so that if it can't allocate a new receive
Roland> buffer, it reposts the old buffer and drops the packet it
Roland> just received.
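
The scheme described in the quoted text can be sketched in plain C as below. This is only an illustration under stated assumptions, not the IPoIB patch itself: `rx_slot`, `rx_alloc_fn`, `rx_complete` and `rx_alloc_fail` are hypothetical names, and ordinary heap buffers stand in for posted receive buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one receive ring slot (not a driver type). */
struct rx_slot {
    void *buf;   /* buffer currently posted for this slot */
};

/* Allocator signature; malloc() matches it. */
typedef void *(*rx_alloc_fn)(size_t size);

/* Demonstration allocator that always fails, to exercise the drop path. */
static void *rx_alloc_fail(size_t size)
{
    (void)size;
    return NULL;
}

/*
 * Handle one receive completion for `slot`.
 *
 * Returns the buffer holding the received packet (the caller now owns it),
 * or NULL if the packet was dropped because no replacement buffer could be
 * allocated. In both cases the slot still has a buffer posted on return.
 */
static void *rx_complete(struct rx_slot *slot, size_t buf_size,
                         rx_alloc_fn alloc)
{
    void *fresh;
    void *received;

    assert(slot && alloc);

    fresh = alloc(buf_size);
    if (!fresh) {
        /* Allocation failed: leave the old buffer posted, drop the packet. */
        return NULL;
    }

    received = slot->buf;   /* pass the filled buffer up the stack */
    slot->buf = fresh;      /* post the replacement in its place */
    return received;
}
```

The point of allocating before handing the old buffer up is that the slot never ends up empty, so a failed allocation costs one packet rather than one receive buffer.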
Here's a patch that changes IPoIB to use this scheme. This should be
muc
> http://ozlabs.org/pipermail/linuxppc64-dev/2005-July/004662.html1
delete the '1' from the end of the URL...
- R.
Robert> Since the rest of the patch needed to get this working
Robert> isn't applied to either the trunk or the ipath branch yet
Robert> (and since the branch will be going away shortly), can you
Robert> just apply this patch to the trunk when you do the merge?
Sure, no problem.
I am not sure whether something related
to dma_addr_t. Could you please try below patch?
> http://ozlabs.org/pipermail/linuxppc64-dev/2005-July/004662.html1
Thanks
Shirley Ma
IBM Linux Technology Center
15300 SW Koll Parkway
Beaverton, OR 97006-6063
Phone(Fax): (503) 578-7638
> And here's a patch to ipath to make it work with the uverbs command mask...
Roland,
Since the rest of the patch needed to get this working isn't applied to
either the trunk or the ipath branch yet (and since the branch will be
going away shortly), can you just apply this patch to the trunk when
On Wed, Oct 12, 2005 at 01:04:37PM +0200, IBMEHCA DD wrote:
> I just released the ehca2_0028 which uses svn 3615 on
> https://sourceforge.net/projects/ibmehcad/
> As you might notice the license already has changed to the openib.org
> license.
>
> With 2.6.13 we had the non-issue that our maun f
Helen> It doesn't seem like shrinking the TCP window had helped.
Helen> I captured the Dmesg log from Lustre server and associated
Helen> client reporting IOZONE error.
What is the state of the system after you start seeing the ib0
transmit time out messages? Does IPoIB work at all?
Roland,
It doesn't seem like shrinking the TCP window had helped. I captured the
Dmesg log from Lustre server and associated client reporting IOZONE error.
BTW, this problem is a moving target so it is hard to believe that it
is hardware related(?). BTW, I am using the Mellanox DDR switch and HCA
And here's a patch to ipath to make it work with the uverbs command mask...
Index: infiniband/hw/ipath/ib_ipath/ipath_openib.c
===
--- infiniband/hw/ipath/ib_ipath/ipath_openib.c (revision 3758)
+++ infiniband/hw/ipath/ib_ipath/ipath_
OK, here's a new patch that adds a mask of allowed userspace commands
set by the kernel low-level driver.
Thanks, good catch Michael...
- R.
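
As a rough illustration of the idea of a per-device mask of allowed userspace commands, a minimal C sketch follows. Everything in it is an assumption made for illustration: the `demo_*` names, command numbers and error codes are hypothetical and are not taken from the patch or from the uverbs headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical command numbers, for illustration only. */
enum demo_cmd {
    DEMO_CMD_CREATE_CQ = 0,
    DEMO_CMD_POLL_CQ   = 1,
    DEMO_CMD_CREATE_QP = 2,
    DEMO_CMD_MAX
};

/* Placeholder error codes (a kernel would use its errno values). */
#define DEMO_EINVAL  (-1)
#define DEMO_ENOTSUP (-2)

/* Hypothetical device: the low-level driver fills in cmd_mask. */
struct demo_device {
    uint64_t cmd_mask;   /* bit n set => command n allowed from userspace */
};

/*
 * Check a userspace command against the device's mask before dispatching.
 * Returns 0 if the command may proceed, a negative code otherwise.
 */
static int demo_dispatch(const struct demo_device *dev, unsigned int cmd)
{
    assert(dev);

    if (cmd >= DEMO_CMD_MAX)
        return DEMO_EINVAL;      /* unknown command number */
    if (!(dev->cmd_mask & (1ULL << cmd)))
        return DEMO_ENOTSUP;     /* driver did not enable this command */

    return 0;                    /* a real dispatcher would call the handler here */
}
```

With such a check in the common dispatch path, a device that never sets the poll-CQ bit rejects that command up front instead of reaching a handler the driver does not implement.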
--- include/rdma/ib_user_verbs.h (revision 3707)
+++ include/rdma/ib_user_verbs.h (working copy)
@@ -1,6 +1,7 @@
/*
* Copyright (c) 2005
Michael> What prevents the user from passing e.g. poll cq command
Michael> on mthca device? If that happens, it seems that
Michael> ib_poll_cq will then crash.
Michael> Is there a mask somewhere that lets the device specify
Michael> which uverbs commands are allowed for it?
Hm
Robert> Since qp_type is now in ibv_qp, it probably no longer
Robert> needs to be in mthca_qp. This is just a minor
Robert> optimization.
Yep, I'll make that change too.
- R.
> @@ -488,6 +489,7 @@ struct ibv_qp {
> uint32_t handle;
> uint32_t qp_num;
> enum ibv_qp_state state;
> + enum ibv_qp_type qp_type;
>
> pthread_mutex_t mutex;
> pthread_cond_t cond;
Since qp_type is no
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: [RFC] Kernel uverbs changes for PathScale merge
>
> Here are the changes to the kernel part of userspace verbs required to
> support PathScale's driver. I'm now happy with them and ready to
> commit them to the svn trunk and queue them for
Here are the changes to libibverbs required to support PathScale's
driver. Again, I'm happy with them and would just like to get
comments on them before I commit them to svn.
Thanks,
Roland
--- libibverbs/include/infiniband/driver.h (revision 3774)
+++ libibverbs/include/infiniband/driver
Here are the changes to the kernel part of userspace verbs required to
support PathScale's driver. I'm now happy with them and ready to
commit them to the svn trunk and queue them for 2.6.15. This will
allow the PathScale hardware-specific driver to be moved to the trunk
as well, although quite a
Roland,
>From [EMAIL PROTECTED] Thu Oct 13 13:53:05 2005
>
>Helen> Roland, Thank you for your response. That fixed my initial
>Helen> buffer allocation failure. After we tuned the Lustre and
>Helen> reran same IOZONE tests again, we got the following
>Helen> problem. Was there a
Helen> Roland, Thank you for your response. That fixed my initial
Helen> buffer allocation failure. After we tuned the Lustre and
Helen> reran same IOZONE tests again, we got the following
Helen> problem. Was there an actual network interrupt? If so, the
Helen> problem is not
Michael S. Tsirkin wrote:
Quoting r. Arlin Davis <[EMAIL PROTECTED]>:
Subject: [PATCH] perftest/rdma_bw; add support for RDMA read and starting PSN
Michael,
The patch adds command line options for RDMA reads and starting PSN. I
used these modifications to help isolate the RDMA read perform
Roland,
Thank you for your response. That fixed my initial buffer
allocation failure. After we tuned the Lustre and reran
same IOZONE tests again, we got the following problem.
Was there an actual network interrupt? If so, the problem
is not obvious now; the two nodes are pinging over IPoIB.
Pl
On Thu, 13 Oct 2005, Arlin Davis wrote:
> James,
>
> Patch will fix the async error handling and callback mappings. QP/CQ
> error mappings were totally screwed up. Updated TODO list.
>
> -arlin
Committed in revision 3774.
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: ib0: ipoib_ib_post_receive failed for buf 111 ib0: failed to
> allocate receive buffer
>
> Michael> Yes, it seems that if such an allocation fails IPoIB may
> Michael> never repost the receive buffer. Is that right?
>
> I think
Michael> Yes, it seems that if such an allocation fails IPoIB may
Michael> never repost the receive buffer. Is that right?
I think so.
My plan is to change the receive handling of IPoIB slightly, so that
if it can't allocate a new receive buffer, it reposts the old buffer
and drops the pa
Quoting r. Arlin Davis <[EMAIL PROTECTED]>:
> Subject: [PATCH] perftest/rdma_bw; add support for RDMA read and starting PSN
>
> Michael,
>
> The patch adds command line options for RDMA reads and starting PSN. I
> used these modifications to help isolate the RDMA read performance
> degradation wi
Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> IPoIB's handling of these allocation errors can definitely be improved
Yes, it seems that if such an allocation fails IPoIB may never repost
the receive buffer. Is that right?
--
MST
James,
Patch will fix the async error handling and callback mappings. QP/CQ error
mappings were totally screwed up. Updated TODO list.
-arlin
Signed-off-by: Arlin Davis <[EMAIL PROTECTED]>
Index: dapl/openib/TODO
===
--- dapl/op
I agree with Mike's analysis. But I'd also like to point out that even
when source compatibility is not a requirement, source familiarity
is. That is, even when recoding is feasible the API should only
introduce new concepts as required to improve efficiency. The
shift from socket model to Q
At 03:14 PM 10/12/2005, Caitlin Bestler wrote:
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED]] On Behalf Of Sean Hefty
> Sent: Wednesday, October 12, 2005 2:36 PM
> To: Michael Krause
> Cc: openib-general@openib.org
> Subject: Re: [openib-general] [RFC] IB
Michael,
The patch adds command line options for RDMA reads and starting PSN. I used
these modifications to help isolate the RDMA read performance degradation with
4.6.2 firmware.
-arlin
Signed-off-by: Arlin Davis <[EMAIL PROTECTED]>
Index: rdma_bw.c
=
> From: Arlin Davis [mailto:[EMAIL PROTECTED]
> Sent: Thursday, October 13, 2005 9:42 AM
>
> Sean Hefty wrote:
>
> > Arlin Davis wrote:
> >
> >> I just noticed some RDMA read performance issues that seem to be
> >> related to the QP starting sequence number. If I set the starting
> >> sequence to
Sean Hefty wrote:
Arlin Davis wrote:
I just noticed some RDMA read performance issues that seem to be
related to the QP starting sequence number. If I set the starting
sequence to 1 then all is fine but if I set it to 0x1 then it
seems to add ~40us to my 32KB RDMA read operation (polling
Sayantan,
Thanks for the reply. I was just using make in the mvapich-gen2 directory,
which may call the script; I don't know. I'll take a look at the doc you suggested
and go through the troubleshooting in there.
John
Sayantan Sur wrote:
Hi John,
* On Oct,6 John Partridge<[EMAIL PROTECTED]> wro
On Thu, 2005-10-13 at 03:10, Mohit Katiyar, Noida wrote:
> Hi all,
> If anyone can suggest some good possible solution for migrating from
> Clients FC Switch -> SAN connection
> To
> Clients---> IB network---> SAN Connection
It depends on your storage. There are two c
On Wed, 2005-10-12 at 19:39, Sean Hefty wrote:
> The following patch returns the GID of the IP gateway for non-local
> subnet IP addresses.
>
> Hal, does this change look correct to you? I don't have an easy way
> to test this fully.
Yes, this looks right.
I think the address resolution part c
Hi all,
If anyone can suggest some good possible solution for migrating from
Clients FC Switch -> SAN connection
To
Clients---> IB network---> SAN Connection
The most economical I can think of is
Clients -> IB Switch > IB FC gateway---> FC
Switch--