date:20060927

[openib-general] is IB/cm: Randomize starting comm ID fix missing in OFED 1.1 ?!

2006-09-27 Thread Or Gerlitz

Michael,

I understand that OFED 1.1 is based on the IB code of 2.6.18-rc6, however, 
this patch which was pushed to 2.6.19-rc1 solves a real problem which was 
reported from a Lustre field install and can be easily reproduced in the lab. 

Can it go into rc7?

http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=f06d26537559113207e4b73af6a22eaa5c5e9dc3

Or.

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

Re: [openib-general] FW: Mstflint - not working on ppc64 and when driver is not loaded on AMD

2006-09-27 Thread Michael S. Tsirkin

Quoting r. Moshe Kazir <[EMAIL PROTECTED]>:
> The mstflint operated in the "classic way"  in OFED-1.1 is not working
> on PPC64 sles10  !!!

I consider the classic way to be
-d /sys/class/infiniband/mthca0/device/resource0

It does seem a bit verbose now that you mention this - would
a shortcut to allow just -d mthca0 help a lot?

-- 
MST

Re: [openib-general] backporting fixes

2006-09-27 Thread Michael S. Tsirkin

Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> Subject: Re: backporting fixes
> 
>  > Now that  2.6.18 (with an additional patch) I looked at backporting 
> bugfixes to
>  > older kernels.  The main problem I see is that the neighbour destructor
>  > interface change is not in 2.6.16, so IPoIB crashes randomly.
>  > 
>  > So approaches are
>  > - Try to push the change into 2.6.16 by netdev
>  > - Use the all-neighbour list as done by ofed
>  > - Abandon the whole project
>  > 
>  > Ideas?
> 
> Unfortunately I don't think this bug is very amenable to being fixed
> in a 2.6.16/-stable tree.  So the third solution is probably the best
> we can do at this point.

OK. How about 2.6.17.y?
I'm somewhat confused whether someone is still maintaining these.

-- 
MST

Re: [openib-general] RDMA CM callback status

2006-09-27 Thread Michael S. Tsirkin

Quoting r. Sean Hefty <[EMAIL PROTECTED]>:
> Subject: Re: RDMA CM callback status
> 
> Sean Hefty wrote:
> >>1. Should I even be looking at event->status or does the event type tell me
> >>  everything I need to know?  I've had a report that the assertion
> >>  (event->status != 0) is failing on RDMA_CM_EVENT_ROUTE_ERROR.
> > 
> > It sounds like (and looks like from reading the code) that you've hit a bug 
> > with
> > the ROUTE_ERROR event.  The failure status isn't being propagated up to the
> > user.
> 
> I've committed a patch to svn which will set the event status correctly when 
> a 
> route error occurs.

Can you post a patch pls?

-- 
MST

Re: [openib-general] Compile warnings (cross build)

2006-09-27 Thread Michael S. Tsirkin

Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> -  (u64)c2dev->rep_vq.host_dma);
> +  (unsigned long long) c2dev->rep_vq.host_dma);

BTW, is there some printk format to print u64 type?

-- 
MST
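
On the u64 question: the patch in this thread suggests the usual idiom, which is to cast the value to unsigned long long and print it with %llx, since u64 is not the same underlying type on every architecture. The following is a minimal userspace sketch of that idiom, assuming uint64_t as a stand-in for the kernel's u64 and snprintf as a stand-in for printk; the helper name format_dma_addr is hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 64-bit value as hex into buf.
 * The explicit cast makes the argument type match %llx on every ABI,
 * whether the 64-bit typedef is unsigned long or unsigned long long.
 * format_dma_addr is a hypothetical helper used only for illustration.
 */
static int format_dma_addr(char *buf, size_t len, uint64_t dma_addr)
{
    return snprintf(buf, len, "dma %llx", (unsigned long long) dma_addr);
}
```

Without the cast, a compiler that checks format strings can warn on targets where the 64-bit type is plain unsigned long, which appears to be what the cross build ran into.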

Re: [openib-general] 2.6.18 kernel support in the main trunk.

2006-09-27 Thread Michael S. Tsirkin

Quoting r. Bryan O'Sullivan <[EMAIL PROTECTED]>:
> Subject: Re: 2.6.18 kernel support in the main trunk.
> 
> On Wed, 2006-09-27 at 14:24 -0700, Roland Dreier wrote:
> > Do we have to keep the kernel modules in svn limping along?  As time
> > goes on, I have less and less patience for double maintenance.
> 
> I'm still all in favour of nuking them...

Me too.

-- 
MST

Re: [openib-general] [RFC] [PATCH] ib_cm: send DREP in response to unmatched DREQ

2006-09-27 Thread Michael S. Tsirkin

Quoting r. Sean Hefty <[EMAIL PROTECTED]>:
> Subject: Re: [RFC] [PATCH] ib_cm: send DREP in response to unmatched DREQ
> 
> Sean Hefty wrote:
> > Currently a DREP is only sent in response to a DREQ if a connection
> > has been found matching the DREQ, and it is in the proper state.  Once
> > a DREP is sent, the local connection moves into timewait.  Duplicate
> > DREQs received while in this state result in re-sending the DREP.
> > 
> > However, it's likely that the local connection will enter and exit
> > timewait before the remote side times out a lost DREP and resends a DREQ.
> > There are a couple possible solutions to this.  One is to increase how
> > long a connection remains in timewait, by multiplying its wait time by
> > max_cm_retries.  This can greatly increase the timewait state before a QP
> > can be re-used when CM messages are not lost.
> > 
> > An alternative is to send a DREP in response to a DREQ, even if a local
> > connection is not found, which is what this patch does.
> 
> If there are no objections, I will commit this patch to svn, and submit for 
> inclusion upstream.

I'm OK with this change.

-- 
MST

Re: [openib-general] [RFC] [PATCH] ib_cm: send DREP in response to unmatched DREQ

2006-09-27 Thread Or Gerlitz

Sean Hefty wrote:
> Sean Hefty wrote:

>> An alternative is to send a DREP in response to a DREQ, even if a local
>> connection is not found, which is what this patch does.

> If there are no objections, I will commit this patch to svn, and submit for 
> inclusion upstream.

Sean,

My understanding is that without this patch the side that sends the DREQ 
would resend the DREQ a few times, because the first DREPs were lost and no 
DREPs are sent once the id at the peer side has left the timewait state. Correct?

Arlin,

Can you please share what the implications were for Intel MPI running a 
64-node (128 ranks?) job? Was the issue here just a longer ***job 
termination time***?

I have no objection to merging it; I just think it would be nice to 
understand better what problem this patch solves in terms of the use case 
that drove the fix.

Or.

Re: [openib-general] [PATCH] IB/ipoib: NAPI

2006-09-27 Thread Michael S. Tsirkin

Quoting r. Roland Dreier <[EMAIL PROTECTED]>:
> If this value makes a real difference in practice, we can make it
> tunable but I would like to see some hard benchmarks that show it
> making a big difference one way or another.  But we have too many
> knobs as it is so I'm inclined to just pick a value that works OK.

Fair enough, let's start simple.
BTW, are you going to post the rewritten NAPI patch
for testing soon?

-- 
MST

Re: [openib-general] FW: Mstflint - not working on ppc64 and when driver is not loaded on AMD

2006-09-27 Thread Michael S. Tsirkin

Quoting r. Tseng-Hui (Frank) Lin <[EMAIL PROTECTED]>:
> On some ppc64 and x86_64 machines, the I/O memory mapped by mmap() is
> not accessible (returns 0x) unless the kernel code (usually the
> device driver) does an ioremap. This is why mmap resource0 does not work
> on these machines.

Let's be exact here: ioremap *only* does not work if driver is not loaded.
Is that right? If yes, the typical and safe thing for the user is to have
driver loaded and do
-d /sys/class/infiniband/mthca0/device/resource0
without playing with lspci and other low level hacks,
and I would rather you told users to do *that*
(by the way, would it help if you could use "-d mthca0")?

> There is no way I am aware of can do ioremap from
> user space code like mstflint. The only thing I can think of is to fall
> back to use the config space file in /proc/bus/pci/.

How about write/read to/from resource0? Does that work?

> The (big) patch I made checks if the faster way (mmap resource0) works.
> If it doesn't, the patch tries other slower ways and uses the fastest
> working way it can find. That's all the patch does. It does not make a big
> fix. It just saves the users the trouble of trying all possible ways of
> opening a device manually.

I don't reject that approach, not on principle.
This is absolutely something we can consider for trunk.
But let's first try to make memory access work, even if
it's not with mmap.

-- 
MST

Re: [openib-general] FW: Mstflint - not working on ppc64 and when driver is not loaded on AMD

2006-09-27 Thread Moshe Kazir

Michael wrote :
> Since I don't consider this a critical fix (there's no reason driver 
> won't go up, and if it does not, there's a simple workaround by 

Michael , 
The mstflint operated in the "classic way"  in OFED-1.1 is not working
on PPC64 sles10  !!!

Telling the customer to use a workaround (open /proc...) if their
platform is PPC64 is not nice !!

We need to fix the bug in the code !

Frank wrote :
>  The patch can be enabled by defining CONFIG_MOPEN_FALL_BACK to 1.
CONFIG_MOPEN_FALL_BACK is defined to 1 for ppc64 and x86_64 and 0 for
others

This define keeps the program from being damaged when running on other
platforms.

Can you have a look at the code once more and write how you want us (me
and Frank ) to refine it ?

It's o.k. for us if the fix goes into OFED-1.2, but we need
it in the code !

Moshe

Moshe Katzir   |  +972-9971-8639 (o)   |   +972-52-860-6042  (m)

Voltaire - The Grid Backbone

 www.voltaire.com

-Original Message-
From: Tseng-Hui (Frank) Lin [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, September 27, 2006 7:46 PM
To: Michael S. Tsirkin
Cc: Moshe Kazir; Tseng-hui Lin; openib-general@openib.org
Subject: Re: [openib-general] FW: Mstflint - not working on ppc64
andwhendriver is not loaded on AMD

On Wed, 2006-09-27 at 18:19 +0300, Michael S. Tsirkin wrote:
> Quoting r. Moshe Kazir <[EMAIL PROTECTED]>:
> > Subject: FW: [openib-general] Mstflint - not working on ppc64 and 
> > whendriver is not loaded on AMD
> > 
> > Michael,
> >  
> > Frank new version was tested once more in Voltaire and is working 
> > o.k. . I tested  `./mstflint -d  q`  when drivers are 
> > loaded and when drivers are not loaded. in all cases it worked o.k.
> 
> Thanks for testing, but I'd like to get a handle on what's going on 
> first.
> 
> First, I'm pretty sure when driver is loaded things work OK on all 
> systems. When driver is not loaded - could you please answer whether 
> using /sys/bus/pci/devices/\:03\:00.0/resource0
> works for you (on systems that have resource0)?
> 

It doesn't work.

> >  
> > Test was performed on the following environments :
> >  
> > -IBM js21 ppc64 sles10 PCI-E
> > -IBM js21 ppc64 sles9 sp3 PCI-E
> > -IBM hs21 em64t redhat as 4 u3 PCI-E
> > -IBM hs21 em64t sles 9 sp3 PCI-E
> > -x86_64 sles10  PCI-E
> > -MAC ppc64 sles10 PCI-X
> > -MAC ppc64 sles10 PCI-E
> >
> > Please consider inserting the patch to OFED .
> >  
> > Moshe
> 
> Since I don't consider this a critical fix (there's no reason driver 
> won't go up, and if it does not, there's a simple workaround by 
> specifying the /proc interface, that is slower but works), I don't 
> think this should go into OFED 1.1.
> 
> Unfortunately, I never got a small bugfix patch against the latest 
> mstflint - the patch I saw posted touches all kind of things all over 
> the code - so I can't insert it in trunk, either.
> 

I agree this is not critical. The patch changes nothing but the way of
opening the device.

On some ppc64 and x86_64 machines, the I/O memory mapped by mmap() is
not accessible (returns 0x) unless the kernel code (usually the
device driver) does an ioremap. This is why mmap of resource0 does not work
on these machines. I am not aware of any way to do an ioremap from
user-space code like mstflint. The only thing I can think of is to fall
back to using the config space file in /proc/bus/pci/.

The (big) patch I made checks if the faster way (mmap resource0) works.
If it doesn't, the patch tries other slower ways and uses the fastest
working way it can find. That's all the patch does. It does not make a big
fix. It just saves the users the trouble of trying all possible ways of
opening a device manually.
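
The fall-back order described above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the struct, the function names, and the two stand-in probes are all hypothetical, and real probes would have to mmap resource0 and check that reads return sane data, or open the config space file under /proc/bus/pci/.

```c
#include <stddef.h>

/* Hypothetical description of one way to reach the device. */
struct access_method {
    const char *name;                 /* label for diagnostics */
    int (*probe)(const char *device); /* returns 0 if this method works */
};

/*
 * Walk the table in order (fastest method first) and return the index
 * of the first method whose probe succeeds, or -1 if none does.
 */
static int pick_access_method(const struct access_method *methods,
                              size_t count, const char *device)
{
    for (size_t i = 0; i < count; i++) {
        if (methods[i].probe(device) == 0)
            return (int) i;
    }
    return -1;
}

/* Stand-in probes for illustration only: the fast path is assumed to
 * fail here and the slow path is assumed to work. */
static int probe_mmap_resource0(const char *device)
{
    (void) device;
    return -1;
}

static int probe_pci_config(const char *device)
{
    (void) device;
    return 0;
}
```

Keeping the methods in one ordered table means a platform where the fast path works never touches the slower ones, which matches the claim that the patch changes nothing when the first method succeeds.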

I understand applying a big patch is risky unless it can be thoroughly
tested. Unfortunately, no one has all the machines to test the patch.
Moshe and I have tested the patch on Power MAC, Squadrons, JS20, and
JS21 (almost all living ppc64 machines) as well as a few x86_64
machines. We believe this patch is safe for these machines. The patch
can be enabled by defining CONFIG_MOPEN_FALL_BACK to 1.
CONFIG_MOPEN_FALL_BACK is defined to 1 for ppc64 and x86_64 and 0 for
others. We can enable this patch on other machines once people who have
these machines have tested the patch.

I agree this is not a critical patch, but it is a useful one. Moreover,
it is well tested on the machines with the patch enabled and changes
nothing on the machines with the patch disabled. I believe this is a
safe patch. Please re-consider adding it. Thanks.

Re: [openib-general] oops after rmmod ib_cm when stopping iSER

2006-09-27 Thread Erez Zilber

Sean Hefty wrote:
> Erez Zilber wrote:
>> When stopping iSER, we run 'modprobe -r ib_iser'. Then, we see an 
>> oops (below). In order to check which module caused that oops, I 
>> replaced the 'modprobe -r' call with rmmod for each module:
>>
>> rmmod ib_iser
>> rmmod libiscsi
>> rmmod scsi_transport_iscsi
>> rmmod rdma_cm
>> rmmod ib_addr
>> rmmod ib_cm
>>
>> If I wait a few seconds before the removal of ib_cm, everything is ok.
>
> Thanks for the info.  My guess is that the cm_id's are not taking a 
> reference on the cm devices, which is allowing the module unload to 
> proceed while cm_id's still remain in timewait.  I will look at this 
> in more detail and work on a patch.  How reproducible is this?
>
> - Sean
100% reproducible. It happens every time.

Erez

[openib-general] [PATCH] osm_vendor_mlx_sa.c - missing status on timeout SA query

2006-09-27 Thread Eitan Zahavi

Hi Hal

Similar to the bug Yevgeny discovered in osm_vendor_ibumad_sa.c,
the very same bug exists in osm_vendor_mlx_sa.c and causes osmtest to fail.
The issue is that the status of the query is not returned 
as the result of the SA query.

Eitan

Signed-off-by:  Eitan Zahavi <[EMAIL PROTECTED]>

Index: libvendor/osm_vendor_mlx_sa.c
===
--- libvendor/osm_vendor_mlx_sa.c   (revision 9642)
+++ libvendor/osm_vendor_mlx_sa.c   (working copy)
@@ -219,7 +219,8 @@ __osmv_sa_mad_err_cb(
 
   query_res.status = IB_TIMEOUT;
   query_res.result_cnt = 0;
-
+  query_res.p_result_madw->status = IB_TIMEOUT;
+  p_madw->status = IB_TIMEOUT;
   query_res.query_type = p_query_req_copy->query_type;
 
   p_query_req_copy->pfn_query_cb( &query_res );
@@ -611,6 +612,7 @@ __osmv_send_sa_req(
  "Waiting for async event.\n" );
 cl_event_wait_on( &p_bind->sync_event, EVENT_NO_TIMEOUT, FALSE );
 cl_event_reset(&p_bind->sync_event);
+status = p_madw->status;
   }
 
  Exit:


Re: [openib-general] Compile warnings (cross build)

2006-09-27 Thread Tom Tucker

This all looks good to me.

Thanks,
Tom


On 9/27/06 4:47 PM, "Roland Dreier" <[EMAIL PROTECTED]> wrote:

> OK, this is what I just came up with to fix these.
> 
> Look OK to you Tom?
> 
> diff --git a/drivers/infiniband/hw/amso1100/c2_ae.c
> b/drivers/infiniband/hw/amso1100/c2_ae.c
> index 08f46c8..3aae497 100644
> --- a/drivers/infiniband/hw/amso1100/c2_ae.c
> +++ b/drivers/infiniband/hw/amso1100/c2_ae.c
> @@ -197,7 +197,7 @@ void c2_ae_event(struct c2_dev *c2dev, u
> "resource=%x, qp_state=%s\n",
> __FUNCTION__,
> to_event_str(event_id),
> -   be64_to_cpu(wr->ae.ae_generic.user_context),
> +   (unsigned long long) be64_to_cpu(wr->ae.ae_generic.user_context),
> be32_to_cpu(wr->ae.ae_generic.resource_type),
> be32_to_cpu(wr->ae.ae_generic.resource),
> to_qp_state_str(be32_to_cpu(wr->ae.ae_generic.qp_state)));
> diff --git a/drivers/infiniband/hw/amso1100/c2_alloc.c
> b/drivers/infiniband/hw/amso1100/c2_alloc.c
> index 1d25299..028a60b 100644
> --- a/drivers/infiniband/hw/amso1100/c2_alloc.c
> +++ b/drivers/infiniband/hw/amso1100/c2_alloc.c
> @@ -115,7 +115,7 @@ u16 *c2_alloc_mqsp(struct c2_dev *c2dev,
>((unsigned long) &(head->shared_ptr[mqsp]) -
> (unsigned long) head);
> pr_debug("%s addr %p dma_addr %llx\n", __FUNCTION__,
> -&(head->shared_ptr[mqsp]), (u64)*dma_addr);
> +&(head->shared_ptr[mqsp]), (unsigned long long) *dma_addr);
> return &(head->shared_ptr[mqsp]);
> }
> return NULL;
> diff --git a/drivers/infiniband/hw/amso1100/c2_provider.c
> b/drivers/infiniband/hw/amso1100/c2_provider.c
> index dd6af55..622d6f1 100644
> --- a/drivers/infiniband/hw/amso1100/c2_provider.c
> +++ b/drivers/infiniband/hw/amso1100/c2_provider.c
> @@ -397,7 +397,9 @@ static struct ib_mr *c2_reg_phys_mr(stru
> pr_debug("%s - page shift %d, pbl_depth %d, total_len %u, "
> "*iova_start %llx, first pa %llx, last pa %llx\n",
> __FUNCTION__, page_shift, pbl_depth, total_len,
> -  *iova_start, page_list[0], page_list[pbl_depth-1]);
> +  (unsigned long long) *iova_start,
> + (unsigned long long) page_list[0],
> + (unsigned long long) page_list[pbl_depth-1]);
> err = c2_nsmr_register_phys_kern(to_c2dev(ib_pd->device), page_list,
> (1 << page_shift), pbl_depth,
> total_len, 0, iova_start,
> diff --git a/drivers/infiniband/hw/amso1100/c2_rnic.c
> b/drivers/infiniband/hw/amso1100/c2_rnic.c
> index f49a32b..e37c568 100644
> --- a/drivers/infiniband/hw/amso1100/c2_rnic.c
> +++ b/drivers/infiniband/hw/amso1100/c2_rnic.c
> @@ -527,7 +527,7 @@ int c2_rnic_init(struct c2_dev *c2dev)
> DMA_FROM_DEVICE);
> pci_unmap_addr_set(&c2dev->rep_vq, mapping, c2dev->rep_vq.host_dma);
> pr_debug("%s rep_vq va %p dma %llx\n", __FUNCTION__, q1_pages,
> -   (u64)c2dev->rep_vq.host_dma);
> +   (unsigned long long) c2dev->rep_vq.host_dma);
> c2_mq_rep_init(&c2dev->rep_vq,
>   1,
>   qsize,
> @@ -550,7 +550,7 @@ int c2_rnic_init(struct c2_dev *c2dev)
> DMA_FROM_DEVICE);
> pci_unmap_addr_set(&c2dev->aeq, mapping, c2dev->aeq.host_dma);
> pr_debug("%s aeq va %p dma %llx\n", __FUNCTION__, q1_pages,
> -   (u64)c2dev->rep_vq.host_dma);
> +   (unsigned long long) c2dev->rep_vq.host_dma);
> c2_mq_rep_init(&c2dev->aeq,
>   2,
>   qsize,



Re: [openib-general] [PATCH] IB/ipoib: NAPI

2006-09-27 Thread Roland Dreier

Michael> Maybe we should just assign EQs to CQs in a round-robin
Michael> fashion for now, and just hope typical use allocates CQs
Michael> sequentially.  Worst case, we are back to where we are
Michael> now, performance-wise.  Roland, how does this sound?

I think what we should do is follow the IB verbs extensions and expose
multiple CQ event vectors, and let the consumer pick which one to use
when creating a CQ.  If IPoIB wants to go round robin itself, that
would be fine.

This is what I tried to set the userspace API up for.  Nothing in
userspace would have to change for this -- the kernel just needs to
add multiple EQ support.

 - R.
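
A consumer that wants to go round robin itself, as suggested above for IPoIB, could keep a small counter and pass the result as the event vector when it creates each CQ (in libibverbs this would be the comp_vector argument of ibv_create_cq). The sketch below is only an illustration under that assumption; the struct and function names are hypothetical and nothing here comes from an actual patch.

```c
/* Hypothetical per-consumer state for spreading CQs over event vectors. */
struct vector_picker {
    unsigned int num_vectors; /* how many event vectors the device exposes */
    unsigned int next;        /* vector to hand out on the next call */
};

/*
 * Return the event vector to use for the next CQ and advance the
 * counter, wrapping around after the last vector.
 */
static unsigned int pick_comp_vector(struct vector_picker *p)
{
    unsigned int v;

    if (p->num_vectors == 0)
        return 0; /* nothing to choose from; fall back to vector 0 */

    v = p->next;
    p->next = (p->next + 1) % p->num_vectors;
    return v;
}
```

With a single event vector this always returns 0, so the behaviour is no worse than today; the spread only appears once the kernel exposes more than one.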

Re: [openib-general] [PATCH] IB/ipoib: NAPI

2006-09-27 Thread Roland Dreier

Shirley> I forgot to mention these NAPI parameters should be
Shirley> tunable for different device drivers, like dev->weight,
Shirley> or set up in lower driver.

Michael> So we need something like poll_weight in struct
Michael> ib_device, to give a hint on how expensive an interrupt
Michael> is versus poll?  Seems to make sense, and actually might
Michael> be useful for other ULPs.  Roland, what do you think?

How could a low-level driver possibly know the cost of an interrupt vs
polling a CQ?  It depends on the particular CPU/cache/chipset details
of the system and it might not even be the same from one PCI slot to
another.

If this value makes a real difference in practice, we can make it
tunable but I would like to see some hard benchmarks that show it
making a big difference one way or another.  But we have too many
knobs as it is so I'm inclined to just pick a value that works OK.

 - R.

Re: [openib-general] Compile warnings (cross build)

2006-09-27 Thread Roland Dreier

OK, this is what I just came up with to fix these.

Look OK to you Tom?

diff --git a/drivers/infiniband/hw/amso1100/c2_ae.c 
b/drivers/infiniband/hw/amso1100/c2_ae.c
index 08f46c8..3aae497 100644
--- a/drivers/infiniband/hw/amso1100/c2_ae.c
+++ b/drivers/infiniband/hw/amso1100/c2_ae.c
@@ -197,7 +197,7 @@ void c2_ae_event(struct c2_dev *c2dev, u
"resource=%x, qp_state=%s\n",
__FUNCTION__,
to_event_str(event_id),
-   be64_to_cpu(wr->ae.ae_generic.user_context),
+   (unsigned long long) 
be64_to_cpu(wr->ae.ae_generic.user_context),
be32_to_cpu(wr->ae.ae_generic.resource_type),
be32_to_cpu(wr->ae.ae_generic.resource),

to_qp_state_str(be32_to_cpu(wr->ae.ae_generic.qp_state)));
diff --git a/drivers/infiniband/hw/amso1100/c2_alloc.c 
b/drivers/infiniband/hw/amso1100/c2_alloc.c
index 1d25299..028a60b 100644
--- a/drivers/infiniband/hw/amso1100/c2_alloc.c
+++ b/drivers/infiniband/hw/amso1100/c2_alloc.c
@@ -115,7 +115,7 @@ u16 *c2_alloc_mqsp(struct c2_dev *c2dev,
((unsigned long) &(head->shared_ptr[mqsp]) -
 (unsigned long) head);
pr_debug("%s addr %p dma_addr %llx\n", __FUNCTION__,
-&(head->shared_ptr[mqsp]), (u64)*dma_addr);
+&(head->shared_ptr[mqsp]), (unsigned long long) 
*dma_addr);
return &(head->shared_ptr[mqsp]);
}
return NULL;
diff --git a/drivers/infiniband/hw/amso1100/c2_provider.c 
b/drivers/infiniband/hw/amso1100/c2_provider.c
index dd6af55..622d6f1 100644
--- a/drivers/infiniband/hw/amso1100/c2_provider.c
+++ b/drivers/infiniband/hw/amso1100/c2_provider.c
@@ -397,7 +397,9 @@ static struct ib_mr *c2_reg_phys_mr(stru
pr_debug("%s - page shift %d, pbl_depth %d, total_len %u, "
"*iova_start %llx, first pa %llx, last pa %llx\n",
__FUNCTION__, page_shift, pbl_depth, total_len,
-   *iova_start, page_list[0], page_list[pbl_depth-1]);
+   (unsigned long long) *iova_start,
+   (unsigned long long) page_list[0],
+   (unsigned long long) page_list[pbl_depth-1]);
err = c2_nsmr_register_phys_kern(to_c2dev(ib_pd->device), page_list,
 (1 << page_shift), pbl_depth,
 total_len, 0, iova_start,
diff --git a/drivers/infiniband/hw/amso1100/c2_rnic.c 
b/drivers/infiniband/hw/amso1100/c2_rnic.c
index f49a32b..e37c568 100644
--- a/drivers/infiniband/hw/amso1100/c2_rnic.c
+++ b/drivers/infiniband/hw/amso1100/c2_rnic.c
@@ -527,7 +527,7 @@ int c2_rnic_init(struct c2_dev *c2dev)
DMA_FROM_DEVICE);
pci_unmap_addr_set(&c2dev->rep_vq, mapping, c2dev->rep_vq.host_dma);
pr_debug("%s rep_vq va %p dma %llx\n", __FUNCTION__, q1_pages,
-(u64)c2dev->rep_vq.host_dma);
+(unsigned long long) c2dev->rep_vq.host_dma);
c2_mq_rep_init(&c2dev->rep_vq,
   1,
   qsize,
@@ -550,7 +550,7 @@ int c2_rnic_init(struct c2_dev *c2dev)
DMA_FROM_DEVICE);
pci_unmap_addr_set(&c2dev->aeq, mapping, c2dev->aeq.host_dma);
pr_debug("%s aeq va %p dma %llx\n", __FUNCTION__, q1_pages,
-(u64)c2dev->rep_vq.host_dma);
+(unsigned long long) c2dev->rep_vq.host_dma);
c2_mq_rep_init(&c2dev->aeq,
   2,
   qsize,

Re: [openib-general] 2.6.18 kernel support in the main trunk.

2006-09-27 Thread Bryan O'Sullivan

On Wed, 2006-09-27 at 14:24 -0700, Roland Dreier wrote:
> Do we have to keep the kernel modules in svn limping along?  As time
> goes on, I have less and less patience for double maintenance.

I'm still all in favour of nuking them...

Re: [openib-general] RDMA CM callback status

2006-09-27 Thread Sean Hefty

Sean Hefty wrote:
>>1. Should I even be looking at event->status or does the event type tell me
>>  everything I need to know?  I've had a report that the assertion
>>  (event->status != 0) is failing on RDMA_CM_EVENT_ROUTE_ERROR.
> 
> It sounds like (and looks like from reading the code) that you've hit a bug 
> with
> the ROUTE_ERROR event.  The failure status isn't being propagated up to the
> user.

I've committed a patch to svn which will set the event status correctly when a 
route error occurs.

- Sean

Re: [openib-general] 2.6.18 kernel support in the main trunk.

2006-09-27 Thread Roland Dreier

Do we have to keep the kernel modules in svn limping along?  As time
goes on, I have less and less patience for double maintenance.

Oh well, since you provided the patch I'll apply it.

 - R.

Re: [openib-general] backporting fixes

2006-09-27 Thread Roland Dreier

 > Now that  2.6.18 (with an additional patch) I looked at backporting bugfixes 
 > to
 > older kernels.  The main problem I see is that the neighbour destructor
 > interface change is not in 2.6.16, so IPoIB crashes randomly.
 > 
 > So approaches are
 > - Try to push the change into 2.6.16 by netdev
 > - Use the all-neighbour list as done by ofed
 > - Abandon the whole project
 > 
 > Ideas?

Unfortunately I don't think this bug is very amenable to being fixed
in a 2.6.16/-stable tree.  So the third solution is probably the best
we can do at this point.

 - R.

Re: [openib-general] [PATCH] fix cma_leave_mc_groups

2006-09-27 Thread Sean Hefty

Krishna Kumar wrote:
> - cma_leave_mc_groups can race with other routines updating
>   or reading the mclist, so use lock. Eg while doing a
>   rdma_destroy_id(), other processes could be looking at
>   this id and de-referencing mclist.

I don't think that there's an issue here.

The mc_list is only accessed by other direct API calls.  For example, 
rdma_join_multicast() or rdma_leave_multicast().  A user cannot call 
rdma_destroy_id() with other API calls.

- Sean

Re: [openib-general] [PATCH] IB/SRP: Enable multichannel

2006-09-27 Thread Ishai Rabinovitz

Roland Dreier wrote:
> Maybe we should just use the port GUID instead of the node GUID to
> form the initiator ID?  That would solve this pretty cleanly I think.


This is also Vu's idea.

There are two issues:

1) My patch allows a sophisticated user to have two logical connections on
the same physical connection. He can have different connection parameters
(e.g., MAX_CMD_PER_LUN) according to the application needs.
 Do you think there is such need?

2) In the current implementation there is a problem when there are two
connections on the same physical connection - when the second connection
sends REQ to the target, the target sends a DREQ to the first connection,
but when someone tries to access the first scsi_host, ib_srp tries to
reconnect the first connection and then the second connection gets a DREQ
- and so the ping pong goes.
And if there is a multipath daemon that checks the status of the
connections, this ping pong can go on forever.
We need to find a way to eliminate this behavior.

Ishai


Re: [openib-general] [PATCH] Fix freed mem deref race in cma_process_remove/cma_req_handler

2006-09-27 Thread Sean Hefty

Good catch.  Thanks - committed.

- Sean

Re: [openib-general] [PATCH] IB/SRP: Enable multichannel

2006-09-27 Thread Roland Dreier

Maybe we should just use the port GUID instead of the node GUID to
form the initiator ID?  That would solve this pretty cleanly I think.

Re: [openib-general] [PATCH] IB/ipoib: NAPI

2006-09-27 Thread Shirley Ma


I have created a patch to monitor CQ. That wasn't the reason for performance drop. I couldn't see any race from the output.

Thanks
Shirley Ma
IBM Linux Technology Center
15300 SW Koll Parkway
Beaverton, OR 97006-6063
Phone(Fax): (503) 578-7638

Re: [openib-general] [PATCH] id_priv_list->list is not initialized sometimes

2006-09-27 Thread Sean Hefty

Krishna Kumar wrote:
> rdma_listen could be called from a context where id_priv->list
> is not initialized. Then at a later stage, a cma_cancel_listen
> does a list_del() which could oops since this element is not
> on any list. 
> 
> Eg, in rdma_listen(), if id->device is !NULL, it calls
> cma_ib_listen() which doesn't add this id to any list. A
> cma_cancel_listen() will do a list_del.

I don't think this is needed.  cma_cancel_listens() is only called if the id is 
listening across multiple devices (and id->device is NULL).  See 
cma_cancel_operation().

- Sean

Re: [openib-general] [PATCH] ucma : Encapsulate duplicate code to common routine

2006-09-27 Thread Sean Hefty

Krishna Kumar wrote:
> Encapsulate duplicate code to common routine - avoid checking same
> errors in multiple places.

I went back and forth on this, but ended up committing it, since it does 
slightly simplify maintenance.

- Sean


Re: [openib-general] [openfabrics-ewg] OFED Status

2006-09-27 Thread Woodruff, Robert J

Aviram wrote,
>Pending that IPoIB HA is solved, we would like to issue RC7, which is supposed to
>be final. Is everyone OK with this approach?
>
>Aviram

Sounds good,

What is the target date for RC7 ?  


Re: [openib-general] [RFC] [PATCH] ib_cm: send DREP in response to unmatched DREQ

2006-09-27 Thread Sean Hefty

Sean Hefty wrote:
> Currently a DREP is only sent in response to a DREQ if a connection
> has been found matching the DREQ, and it is in the proper state.  Once
> a DREP is sent, the local connection moves into timewait.  Duplicate
> DREQs received while in this state result in re-sending the DREP.
> 
> However, it's likely that the local connection will enter and exit
> timewait before the remote side times out a lost DREP and resends a DREQ.
> There are a couple possible solutions to this.  One is to increase how
> long a connection remains in timewait, by multiplying its wait time by
> max_cm_retries.  This can greatly increase the timewait state before a QP
> can be re-used when CM messages are not lost.
> 
> An alternative is to send a DREP in response to a DREQ, even if a local
> connection is not found, which is what this patch does.

If there are no objections, I will commit this patch to svn, and submit for 
inclusion upstream.
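
The decision described in the quoted text can be sketched as a small state check. This is a toy model under stated assumptions: the types and names (conn, dreq_action, handle_dreq) are made up for illustration and are not the ib_cm data structures or the patch itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum conn_state { CONN_ESTABLISHED, CONN_TIMEWAIT };

/* Hypothetical connection record keyed by a local comm ID. */
struct conn {
    uint32_t local_id;
    enum conn_state state;
};

enum dreq_action {
    SEND_DREP_AND_ENTER_TIMEWAIT, /* first DREQ on a live connection  */
    RESEND_DREP,                  /* duplicate DREQ while in timewait */
    SEND_DREP_NO_CONN             /* no match: reply anyway           */
};

/* Linear lookup is enough for a sketch. */
static struct conn *find_conn(struct conn *table, size_t n, uint32_t id)
{
    for (size_t i = 0; i < n; i++)
        if (table[i].local_id == id)
            return &table[i];
    return NULL;
}

/* Decide how to answer a DREQ addressed to comm ID 'id'. */
static enum dreq_action handle_dreq(struct conn *table, size_t n, uint32_t id)
{
    struct conn *c = find_conn(table, n, id);

    if (c == NULL)
        return SEND_DREP_NO_CONN; /* connection already left timewait */
    if (c->state == CONN_TIMEWAIT)
        return RESEND_DREP;
    assert(c->state == CONN_ESTABLISHED);
    c->state = CONN_TIMEWAIT;
    return SEND_DREP_AND_ENTER_TIMEWAIT;
}
```

The third branch is the one that avoids lengthening timewait: the remote side gets its DREP even when the local connection has already been destroyed.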

- Sean


Re: [openib-general] Different byte order between gen1 CM and gen2 CM ->RE: How to connect gen2 CM to gen1 IBGD CM?

2006-09-27 Thread Sean Hefty

Sean Hefty wrote:
> The byte ordering in the kernel APIs are fairly clear about this, but that 
> documentation didn't carry up to userspace everywhere.  I will update the 
> userspace documentation, but it may take me a few weeks to get to this.

I've added some additional comments next to structure fields that are specified 
in network-byte order.  Hopefully this will help others avoid running into 
similar issues.
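
As an illustration of the kind of mistake those comments are meant to prevent, here is a small sketch of filling in and reading back a field that is documented as network byte order. The structure and field name (example_req, remote_qpn) are hypothetical and do not come from the libibcm headers; only htonl()/ntohl() are real.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>

/* Hypothetical request structure, for illustration only. */
struct example_req {
    uint32_t remote_qpn; /* network byte order */
};

/* Callers pass host-order values; conversion happens at the boundary. */
static void example_req_set_qpn(struct example_req *req, uint32_t host_qpn)
{
    req->remote_qpn = htonl(host_qpn);
}

static uint32_t example_req_get_qpn(const struct example_req *req)
{
    return ntohl(req->remote_qpn);
}

/* First byte of the field as it sits in memory (and on the wire). */
static uint8_t example_req_first_wire_byte(const struct example_req *req)
{
    const uint8_t *p = (const uint8_t *)&req->remote_qpn;
    return p[0];
}
```

Keeping the conversion in a setter/getter pair means the stored bytes are the same on little-endian and big-endian hosts, which is what two different CM implementations need in order to interoperate.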

- Sean


[openib-general] mvapich2-gen2 svn - vapi <--> gen2 ??

2006-09-27 Thread Bryan Green

Hello,
Regarding mvapich2-gen2 in the openib svn,
can an mvapich2 vapi build on one machine
communicate with a gen2 build on another?

-bryan



Re: [openib-general] 90-ib.rules incorrect?

2006-09-27 Thread EI

Ishai,

udev in OpenSuSE 10.2 alpha gives an error with the current rules file that are using (=).

Eugene

On 9/27/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> In early versions of udev the syntax was different. The syntax used (=)
> and not (==).
> RHEL4 for example is still using such old version of udev.
>
> Apparently the new udev versions (used for example in SLES10) still
> supports the old syntax.
>
> So this way we can have one file that suits both udev versions.
>
> Ishai
>
> > Isn't the format of 90-ib.rules in
> > https://openfabrics.org/svn/gen2/trunk/ofed/openib/scripts/90-ib.rules
> > incorrect.
> >
> > We have
> >
> > KERNEL="umad*", NAME="infiniband/%k", which should be
> > KERNEL=="umad*", NAME="infiniband/%k"
> >
> > Am I missing something?
> >
> > Eugene

Re: [openib-general] oops after rmmod ib_cm when stopping iSER

2006-09-27 Thread Sean Hefty

Erez Zilber wrote:
> When stopping iSER, we run 'modprobe -r ib_iser'. Then, we see an oops 
> (below). In order to check which module caused that oops, I replaced the 
> 'modprobe -r' call with rmmod for each module:
> 
> rmmod ib_iser
> rmmod libiscsi
> rmmod scsi_transport_iscsi
> rmmod rdma_cm
> rmmod ib_addr
> rmmod ib_cm
> 
> If I wait a few seconds before the removal of ib_cm, everything is ok.

Thanks for the info.  My guess is that the cm_id's are not taking a reference on
the cm devices, which is allowing the module unload to proceed while cm_id's
still remain in timewait.  I will look at this in more detail and work on a
patch.  How reproducible is this?
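
The suspected failure mode can be pictured with a toy reference count. This is only a sketch of the general idea (an object in timewait pins its device until it is released); the names toy_device and toy_id are hypothetical, and this is not the ib_cm code or the eventual fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device with a count of ids that still depend on it. */
struct toy_device {
    int refcnt;
};

struct toy_id {
    struct toy_device *dev;
    bool in_timewait;
};

/* An id takes a reference on its device for its whole lifetime. */
static void toy_id_attach(struct toy_id *id, struct toy_device *dev)
{
    id->dev = dev;
    id->in_timewait = false;
    dev->refcnt++;
}

/* Entering timewait does NOT drop the reference. */
static void toy_id_enter_timewait(struct toy_id *id)
{
    id->in_timewait = true;
}

/* The reference is dropped only when the id is finally released. */
static void toy_id_release(struct toy_id *id)
{
    assert(id->dev != NULL && id->dev->refcnt > 0);
    id->dev->refcnt--;
    id->dev = NULL;
}

/* Unload may proceed only when nothing references the device. */
static bool toy_device_can_unload(const struct toy_device *dev)
{
    return dev->refcnt == 0;
}
```

In this model, removing the module right after the ids enter timewait is refused, which matches the observation that waiting a few seconds before removing ib_cm avoids the oops.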

- Sean


[openib-general] [PATCH] IB/SRP: allowing multiple connections from target to initiator

2006-09-27 Thread Ishai Rabinovitz


SRP High Availability should enable an initiator to connect to the same target
several times, e.g., once from each IB port of the target.

Some targets do not support multichannel. In order to work with them as well
we will use another identifier_extension to the initiator port for each target
connection.

Signed-off-by: Ishai Rabinovitz <[EMAIL PROTECTED]>

---

I think this is the best solution. It allows users to use all four physical
connections from the initiator to target.

It also allows users to have several logical connections on one physical
connection (If they want connection with different attributes - for example
different max_cmd_per_lun).

It is SRP spec compliant.

I also added a module param, so it is possible to turn this option off.

Index: latest/drivers/infiniband/ulp/srp/ib_srp.c
===
--- latest.orig/drivers/infiniband/ulp/srp/ib_srp.c 2006-09-27 10:36:13.0 +0300
+++ latest/drivers/infiniband/ulp/srp/ib_srp.c  2006-09-27 16:48:12.0 +0300
@@ -85,6 +85,13 @@ MODULE_PARM_DESC(mellanox_workarounds,
 
 static const u8 mellanox_oui[3] = { 0x00, 0x02, 0xc9 };
 
+static int variable_identifier_extension = 1;
+
+module_param(variable_identifier_extension, int, 0444);
+MODULE_PARM_DESC(variable_identifier_extension,
+"Use another identifier_extension on each connection to target"
+", allows multichannel connection on all targets if != 0");
+
 static void srp_add_one(struct ib_device *device);
 static void srp_remove_one(struct ib_device *device);
 static void srp_completion(struct ib_cq *cq, void *target_ptr);
@@ -329,6 +336,7 @@ static int srp_send_req(struct srp_targe
req->priv.req_it_iu_len = cpu_to_be32(srp_max_iu_len);
req->priv.req_buf_fmt   = cpu_to_be16(SRP_BUF_FORMAT_DIRECT |
  SRP_BUF_FORMAT_INDIRECT);
+
/*
 * In the published SRP specification (draft rev. 16a), the 
 * port identifier format is 8 bytes of ID extension followed
@@ -341,13 +349,23 @@ static int srp_send_req(struct srp_targe
if (target->io_class == SRP_REV10_IB_IO_CLASS) {
memcpy(req->priv.initiator_port_id,
   target->srp_host->initiator_port_id + 8, 8);
-   memcpy(req->priv.initiator_port_id + 8,
-  target->srp_host->initiator_port_id, 8);
+   if (variable_identifier_extension)
+   memcpy(req->priv.initiator_port_id + 8,
+  &target, sizeof target);
+   else
+   memcpy(req->priv.initiator_port_id + 8,
+  target->srp_host->initiator_port_id, 8);
memcpy(req->priv.target_port_id, &target->ioc_guid, 8);
memcpy(req->priv.target_port_id + 8, &target->id_ext, 8);
} else {
-   memcpy(req->priv.initiator_port_id,
-  target->srp_host->initiator_port_id, 16);
+   if (variable_identifier_extension)
+   memcpy(req->priv.initiator_port_id,
+  &target, sizeof target);
+   else
+   memcpy(req->priv.initiator_port_id,
+  target->srp_host->initiator_port_id, 8);
+   memcpy(req->priv.initiator_port_id + 8,
+  target->srp_host->initiator_port_id + 8, 8);
memcpy(req->priv.target_port_id, &target->id_ext, 8);
memcpy(req->priv.target_port_id + 8, &target->ioc_guid, 8);
}
@@ -1823,7 +1841,8 @@ static struct srp_host *srp_add_port(str
host->dev  = device;
host->port = port;
 
-   host->initiator_port_id[7] = port;
+   if (!variable_identifier_extension)
+   host->initiator_port_id[7] = port;
memcpy(host->initiator_port_id + 8, &device->dev->node_guid, 8);
 
host->class_dev.class = &srp_class;
-- 
Ishai Rabinovitz


Re: [openib-general] FW: Mstflint - not working on ppc64 and whendriver is not loaded on AMD

2006-09-27 Thread Tseng-Hui (Frank) Lin

On Wed, 2006-09-27 at 18:19 +0300, Michael S. Tsirkin wrote:
> Quoting r. Moshe Kazir <[EMAIL PROTECTED]>:
> > Subject: FW: [openib-general] Mstflint - not working on ppc64 and 
> > whendriver is not loaded on AMD
> > 
> > Michael,
> >  
> > Frank new version was tested once more in Voltaire and is working o.k. .
> > I tested  `./mstflint -d  q`  when drivers are loaded and 
> > when drivers are not loaded. in all cases it worked o.k.
> 
> Thanks for testing, but I'd like to get a handle on what's going on first.
> 
> First, I'm pretty sure when driver is loaded things work OK on all systems.
> When driver is not loaded - could you please answer whether using
> /sys/bus/pci/devices/\:03\:00.0/resource0
> works for you (on systems that have resource0)?
> 

It doesn't work.

> >  
> > Test was performed on the following environments :
> >  
> > -IBM js21 ppc64 sles10 PCI-E
> > -IBM js21 ppc64 sles9 sp3 PCI-E
> > -IBM hs21 em64t redhat as 4 u3 PCI-E
> > -IBM hs21 em64t sles 9 sp3 PCI-E
> > -x86_64 sles10  PCI-E
> > -MAC ppc64 sles10 PCI-X
> > -MAC ppc64 sles10 PCI-E
> >
> > Please consider inserting the patch to OFED .
> >  
> > Moshe
> 
> Since I don't consider this a critical fix (there's no reason driver won't go
> up, and if it does not, there's a simple workaround by specifying the /proc
> interface, that is slower but works), I don't think this should go into OFED 
> 1.1.
> 
> Unfortunately, I never got a small bugfix patch against the latest mstflint -
> the patch I saw posted touches all kind of things all over the code -
> so I can't insert it in trunk, either.
> 

I agree this is not critical. The patch changes nothing but the way of
opening the device.

On some ppc64 and x86_64 machines, the I/O memory mapped by mmap() is
not accessible (return 0x) unless kernel code (usually the
device driver) does an ioremap. This is why mmap of resource0 does not work
on these machines. I am not aware of any way to do an ioremap from
user-space code like mstflint. The only thing I can think of is to fall
back to using the config space file in /proc/bus/pci/.

The (big) patch I made checks whether the faster way (mmap of resource0) works.
If it doesn't, the patch tries other, slower ways and uses the fastest
working one it can find. That's all the patch does. It does not make a big
fix. It just saves users the trouble of manually trying all possible ways of
opening a device.
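
The fallback idea can be sketched as an ordered table of access methods that is walked fastest-first. This is an illustrative sketch only: access_method, pick_access_method and the two probe functions are hypothetical names, the probes are stubs that merely simulate a failing mmap and a working config-space path, and none of this is the actual mstflint patch.

```c
#include <assert.h>
#include <stddef.h>

/* One way of opening the device; returns 0 on success, -1 on failure. */
typedef int (*open_fn)(const char *dev);

struct access_method {
    const char *name;
    open_fn     try_open;
};

/* Walk the table in order (fastest first) and return the index of
 * the first method that works, or -1 if none does. */
static int pick_access_method(const struct access_method *m, size_t n,
                              const char *dev)
{
    assert(m != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (m[i].try_open(dev) == 0)
            return (int)i;
    return -1;
}

/* Stub: pretend the mmap'ed region is not readable on this machine. */
static int probe_mmap_resource0(const char *dev)
{
    (void)dev;
    return -1;
}

/* Stub: pretend the slower PCI config-space path works. */
static int probe_pci_config(const char *dev)
{
    (void)dev;
    return 0;
}
```

A caller would list probe_mmap_resource0 before probe_pci_config, so machines where the fast path works never pay for the slow one.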

I understand that applying a big patch is risky unless it can be thoroughly
tested. Unfortunately, no one has all the machines to test the patch.
Moshe and I have tested the patch on Power MAC, Squadrons, JS20, and
JS21 (almost all living ppc64 machines) as well as a few x86_64
machines. We believe this patch is safe for these machines. The patch
can be enabled by defining CONFIG_MOPEN_FALL_BACK to 1.
CONFIG_MOPEN_FALL_BACK is defined to 1 for ppc64 and x86_64 and 0 for
others. We can enable this patch on other machines once people who have
these machines have tested it.

I agree this is not a critical patch, but it is a useful one. Moreover,
it is well tested on the machines with the patch enabled and changes
nothing on the machines with the patch disabled. I believe this is a
safe patch. Please reconsider adding it. Thanks.


Re: [openib-general] 90-ib.rules incorrect?

2006-09-27 Thread ishai

In early versions of udev the syntax was different. The syntax used (=)
and not (==).
RHEL4 for example is still using such old version of udev.

Apparently the new udev versions (used for example in SLES10) still
supports the old syntax.

So this way we can have one file that suits both udev versions.

Ishai


> Isn't the format of 90-ib.rules in
> https://openfabrics.org/svn/gen2/trunk/ofed/openib/scripts/90-ib.rules
> incorrect.
>
> We have
>
> KERNEL="umad*", NAME="infiniband/%k", which should be
> KERNEL=="umad*", NAME="infiniband/%k"
>
> Am I missing something?
>
> Eugene




Re: [openib-general] [PATCH] IB/SRP: Enable multichannel

2006-09-27 Thread Vu Pham

Vu Pham wrote:
> Michael S. Tsirkin wrote:
> 
>>Quoting r. Vu Pham <[EMAIL PROTECTED]>:
>>
>>
>>>Either you can use multiple channels or derive different 
>>>initiator_port_ID in the login req to have multiple paths on 
>>>the same physical port
>>
>>
>>So how about we just stick a pointer inside the identifier extension
>>instead of enabling multichannel?
>>
> 
> 
> That's the simple change. Besides that, you have to maintain a 
> list of connections/channels connected to the same target, 
> manage/clean up the resources associated with these 
> connections, and handle error recovery, especially target 
> reset and host reset...
> 
> What is the advantage of having multiple connections/QPs on 
> the same physical port to the same target? The disadvantages 
> are wasted resources, instability, no fail-over on physical 
> port error...
> 

I see the limitation of the current srp implementation. If we 
have the following topology
host port 1 -- target port 1
host port 1 -- target port 2

the current srp implementation will use the same 
initiator_port_id for both login requests and the target 
will reject the second login if you don't turn on 
SUPPORT_MULTI_CHANNEL

Another way to solve this is to use a different 
initiator_port_id for each login, i.e.

path 1: initiator_port_ID{target_port1_GUID, initiator_port1_GUID}
and target_port_ID{id_ext, ioc_guid}

path 2: initiator_port_ID{target_port2_GUID, initiator_port1_GUID}
and target_port_ID

This also will guarantee the uniqueness of initiator_port_id 
in the fabric
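
A minimal sketch of that construction follows, assuming the two GUIDs are available as host-order 64-bit integers and are laid out big-endian in the 16-byte identifier. The helper names (put_be64, build_initiator_port_id, port_ids_equal) are hypothetical and the byte order is an assumption of this sketch, not something taken from ib_srp.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PORT_ID_LEN 16

/* Store a 64-bit value most-significant byte first. */
static void put_be64(uint8_t *dst, uint64_t v)
{
    for (int i = 7; i >= 0; i--) {
        dst[i] = (uint8_t)(v & 0xff);
        v >>= 8;
    }
}

/* First 8 bytes: target port GUID, used as the identifier extension.
 * Last 8 bytes: initiator port GUID. */
static void build_initiator_port_id(uint8_t id[PORT_ID_LEN],
                                    uint64_t target_port_guid,
                                    uint64_t initiator_port_guid)
{
    assert(id != NULL);
    put_be64(id, target_port_guid);
    put_be64(id + 8, initiator_port_guid);
}

static int port_ids_equal(const uint8_t a[PORT_ID_LEN],
                          const uint8_t b[PORT_ID_LEN])
{
    return memcmp(a, b, PORT_ID_LEN) == 0;
}
```

Two logins from the same host port to two different target ports then carry different initiator port IDs, so a target that does not support multichannel has no reason to treat the second login as a duplicate.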



Re: [openib-general] [PATCH 1/3] IB/iser: have iSER data transaction object pointing to iSER conn

2006-09-27 Thread Mike Christie

Erez Zilber wrote:
> iSER uses a data transaction object (struct iser_dto) as part
> of its IB data descriptors (struct iser_desc) management.
> It also uses a hierarchy of connection structures pointing to
> each other. A DTO may exist even after the iscsi_iser connection
> it points to is destroyed (e.g. one that is bound to a posted
> receive buffer which was flushed by the IB HW). Hence DTOs need
> to point to the lowest connection, which is struct iser_conn.
> 
> Signed-off-by: Erez Zilber <[EMAIL PROTECTED]>
> 

Both look fine to me.

One question not really related to your patches. How much work would you
guys have to do to iscsi_iser to support bi directional commands?


Re: [openib-general] FW: Mstflint - not working on ppc64 and whendriver is not loaded on AMD

2006-09-27 Thread Michael S. Tsirkin

Quoting r. Moshe Kazir <[EMAIL PROTECTED]>:
> Subject: FW: [openib-general] Mstflint - not working on ppc64 and whendriver 
> is not loaded on AMD
> 
> Michael,
>  
> Frank new version was tested once more in Voltaire and is working o.k. .
> I tested  `./mstflint -d  q`  when drivers are loaded and when 
> drivers are not loaded. in all cases it worked o.k.

Thanks for testing, but I'd like to get a handle on what's going on first.

First, I'm pretty sure when driver is loaded things work OK on all systems.
When driver is not loaded - could you please answer whether using
/sys/bus/pci/devices/\:03\:00.0/resource0
works for you (on systems that have resource0)?

>  
> Test was performed on the following environments :
>  
> -IBM js21 ppc64 sles10 PCI-E
> -IBM js21 ppc64 sles9 sp3 PCI-E
> -IBM hs21 em64t redhat as 4 u3 PCI-E
> -IBM hs21 em64t sles 9 sp3 PCI-E
> -x86_64 sles10  PCI-E
> -MAC ppc64 sles10 PCI-X
> -MAC ppc64 sles10 PCI-E
>
> Please consider inserting the patch to OFED .
>  
> Moshe

Since I don't consider this a critical fix (there's no reason driver won't go
up, and if it does not, there's a simple workaround by specifying the /proc
interface, that is slower but works), I don't think this should go into OFED 
1.1.

Unfortunately, I never got a small bugfix patch against the latest mstflint -
the patch I saw posted touches all kind of things all over the code -
so I can't insert it in trunk, either.

-- 
MST


Re: [openib-general] [PATCH] IB/SRP: Enable multichannel

2006-09-27 Thread Vu Pham

Michael S. Tsirkin wrote:
> Quoting r. Vu Pham <[EMAIL PROTECTED]>:
> 
>>Either you can use multiple channels or derive different 
>>initiator_port_ID in the login req to have multiple paths on 
>>the same physical port
> 
> 
> So how about we just stick a pointer inside the identifier extension
> instead of enabling multichannel?
> 

That's the simple change. Besides that, you have to maintain a 
list of connections/channels connected to the same target, 
manage/clean up the resources associated with these 
connections, and handle error recovery, especially target 
reset and host reset...

What is the advantage of having multiple connections/QPs on 
the same physical port to the same target? The disadvantages 
are wasted resources, instability, no fail-over on physical 
port error...


Re: [openib-general] [PATCH] IB/ipoib: NAPI

2006-09-27 Thread Christoph Raisch

> Roland,
>
> > Do you know how ehca behaves?  Does it have that race?  ie what
> > happens in this situation:
> >
> > poll CQ -> CQ is empty
> > (new completion is added to CQ)
> > request notify on CQ
> > (no more completions are added)
> >
> > Mellanox HCAs will generate a CQ event in this case, although it's not
> > strictly required by the IB spec.  How will ehca behave?
> >
> >  - R.
>
> That could be the reason. I did see mthca poll empty entry, but not
> on ehca. I will confirm this with ehca team.
>
> Thanks
> Shirley Ma

It's possible that a race will happen between the interrupt handler, the
poll routine and the hardware.
By doing a

 poll CQ -> CQ is empty
 (new completion is added to CQ)
 request notify on CQ
 (no more completions are added)
 poll one more time

you should be on the safe side.
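
The sequence above can be modelled with a toy completion queue. This is a simulation under stated assumptions, not the verbs API: toy_cq, toy_poll, toy_req_notify and drain_and_rearm are hypothetical names, and the late_arrivals parameter simply stands in for completions that land between the empty poll and the notify request.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy completion queue: a counter plus an "armed" flag. */
struct toy_cq {
    int  pending; /* completions waiting to be polled */
    bool armed;   /* a notification has been requested */
};

/* Drain everything currently queued; returns how many were handled. */
static int toy_poll(struct toy_cq *cq)
{
    int n = cq->pending;
    cq->pending = 0;
    return n;
}

static void toy_req_notify(struct toy_cq *cq)
{
    cq->armed = true;
}

/* Poll, re-arm, then poll once more so that completions arriving in
 * the window between the first poll and the re-arm are not stranded. */
static int drain_and_rearm(struct toy_cq *cq, int late_arrivals)
{
    int handled = toy_poll(cq);

    cq->pending += late_arrivals; /* the race window */
    toy_req_notify(cq);
    handled += toy_poll(cq);      /* the extra poll */
    assert(cq->pending == 0);
    return handled;
}
```

Without the second toy_poll() the late completion would sit in the queue until some unrelated event happened to trigger another poll, which is the stall being discussed for hardware that does not raise an event in this case.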

Gruss / Regards . . . Christoph Raisch




[openib-general] [PATCH 2/3] IB/iser: dma unmap an unaligned for rdma data before touching it

2006-09-27 Thread Erez Zilber

(This patch may be a duplicate. Something went wrong with my previous 
mail.)

iSER uses the dma mapping api to map the page holding the
scsi command data to the hca dma address space. When the
command data is not aligned for rdma, the data is copied
to/from an allocated buffer which in turn is used for
executing this command. The pages associated with the
command must be unmapped before being touched.

Signed-off-by: Erez Zilber <[EMAIL PROTECTED]>

---

 drivers/infiniband/ulp/iser/iscsi_iser.h |7 
 drivers/infiniband/ulp/iser/iser_initiator.c |   49 +-
 drivers/infiniband/ulp/iser/iser_memory.c|   42 ++
 3 files changed, 59 insertions(+), 39 deletions(-)

78a237418bd3547cfeb49828a8b857ac5241749f
diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.h b/drivers/infiniband/ulp/iser/iscsi_iser.h
index 7f44636..4a7069f 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.h
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.h
@@ -350,4 +350,11 @@ int  iser_post_send(struct iser_desc *tx
 
 int iser_conn_state_comp(struct iser_conn *ib_conn,
 enum iser_ib_conn_state comp);
+
+int iser_dma_map_task_data(struct iscsi_iser_cmd_task *iser_ctask,
+   struct iser_data_buf   *data,
+   enum   iser_data_dir   iser_dir,
+   enum   dma_data_direction  dma_dir);
+
+void iser_dma_unmap_task_data(struct iscsi_iser_cmd_task *iser_ctask);
 #endif
diff --git a/drivers/infiniband/ulp/iser/iser_initiator.c b/drivers/infiniband/ulp/iser/iser_initiator.c
index 14ae61e..9b3d79c 100644
--- a/drivers/infiniband/ulp/iser/iser_initiator.c
+++ b/drivers/infiniband/ulp/iser/iser_initiator.c
@@ -66,42 +66,6 @@ static void iser_dto_add_regd_buff(struc
dto->regd_vector_len++;
 }
 
-static int iser_dma_map_task_data(struct iscsi_iser_cmd_task *iser_ctask,
- struct iser_data_buf   *data,
- enum   iser_data_dir   iser_dir,
- enum   dma_data_direction  dma_dir)
-{
-   struct device *dma_device;
-
-   iser_ctask->dir[iser_dir] = 1;
-   dma_device = iser_ctask->iser_conn->ib_conn->device->ib_device->dma_device;
-
-   data->dma_nents = dma_map_sg(dma_device, data->buf, data->size, dma_dir);
-   if (data->dma_nents == 0) {
-   iser_err("dma_map_sg failed!!!\n");
-   return -EINVAL;
-   }
-   return 0;
-}
-
-static void iser_dma_unmap_task_data(struct iscsi_iser_cmd_task *iser_ctask)
-{
-   struct device  *dma_device;
-   struct iser_data_buf *data;
-
-   dma_device = iser_ctask->iser_conn->ib_conn->device->ib_device->dma_device;
-
-   if (iser_ctask->dir[ISER_DIR_IN]) {
-   data = &iser_ctask->data[ISER_DIR_IN];
-   dma_unmap_sg(dma_device, data->buf, data->size, DMA_FROM_DEVICE);
-   }
-
-   if (iser_ctask->dir[ISER_DIR_OUT]) {
-   data = &iser_ctask->data[ISER_DIR_OUT];
-   dma_unmap_sg(dma_device, data->buf, data->size, DMA_TO_DEVICE);
-   }
-}
-
 /* Register user buffer memory and initialize passive rdma
  *  dto descriptor. Total data size is stored in
  *  iser_ctask->data[ISER_DIR_IN].data_len
@@ -699,14 +663,19 @@ void iser_ctask_rdma_init(struct iscsi_i
 void iser_ctask_rdma_finalize(struct iscsi_iser_cmd_task *iser_ctask)
 {
int deferred;
+   int is_rdma_aligned = 1;
 
/* if we were reading, copy back to unaligned sglist,
 * anyway dma_unmap and free the copy
 */
-   if (iser_ctask->data_copy[ISER_DIR_IN].copy_buf != NULL)
+   if (iser_ctask->data_copy[ISER_DIR_IN].copy_buf != NULL) {
+   is_rdma_aligned = 0;
iser_finalize_rdma_unaligned_sg(iser_ctask, ISER_DIR_IN);
-   if (iser_ctask->data_copy[ISER_DIR_OUT].copy_buf != NULL)
+   }
+   if (iser_ctask->data_copy[ISER_DIR_OUT].copy_buf != NULL) {
+   is_rdma_aligned = 0;
iser_finalize_rdma_unaligned_sg(iser_ctask, ISER_DIR_OUT);
+   }
 
if (iser_ctask->dir[ISER_DIR_IN]) {
deferred = iser_regd_buff_release
@@ -726,7 +695,9 @@ void iser_ctask_rdma_finalize(struct isc
}
}
 
-   iser_dma_unmap_task_data(iser_ctask);
+   /* if the data was unaligned, it was already unmapped and then copied */
+   if (is_rdma_aligned)
+   iser_dma_unmap_task_data(iser_ctask);
 }
 
 void iser_dto_buffs_release(struct iser_dto *dto)
diff --git a/drivers/infiniband/ulp/iser/iser_memory.c b/drivers/infiniband/ulp/iser/iser_memory.c
index 31950a5..0f87163 100644
--- a/drivers/infiniband/ulp/iser/iser_memory.c
+++ b/drivers/infiniband/ulp/iser/iser_memory.c
@@ -360,6 +360,44 @@ static void iser_page_vec_build(struct i
}
 }
 
+int iser_dma_map_task_data(struct iscsi_iser_cmd_task *iser_ctask,
+   struct ise

[openib-general] [PATCH 1/3] IB/iser: have iSER data transaction object pointing to iSER conn

2006-09-27 Thread Erez Zilber

(This patch may be a duplicate. Something went wrong with my previous 
mail.)

iSER uses a data transaction object (struct iser_dto) as part
of its IB data descriptors (struct iser_desc) management.
It also uses a hierarchy of connection structures pointing to
each other. A DTO may exist even after the iscsi_iser connection
it points to is destroyed (e.g. one that is bound to a posted
receive buffer which was flushed by the IB HW). Hence DTOs need
to point to the lowest connection, which is struct iser_conn.

Signed-off-by: Erez Zilber <[EMAIL PROTECTED]>

---

 drivers/infiniband/ulp/iser/iscsi_iser.c |2 ++
 drivers/infiniband/ulp/iser/iscsi_iser.h |2 +-
 drivers/infiniband/ulp/iser/iser_initiator.c |   11 ++-
 drivers/infiniband/ulp/iser/iser_verbs.c |8 +---
 4 files changed, 14 insertions(+), 9 deletions(-)

57b132002a5e3bf3ba0ae362f174404e29c69449
diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c
index 101e407..b37f429 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.c
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.c
@@ -317,6 +317,8 @@ iscsi_iser_conn_destroy(struct iscsi_cls
struct iscsi_iser_conn *iser_conn = conn->dd_data;
 
iscsi_conn_teardown(cls_conn);
+   if (iser_conn->ib_conn)
+   iser_conn->ib_conn->iser_conn = NULL;
kfree(iser_conn);
 }
 
diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.h b/drivers/infiniband/ulp/iser/iscsi_iser.h
index 7c3d0c9..7f44636 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.h
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.h
@@ -187,7 +187,7 @@ struct iser_regd_buf {
 
 struct iser_dto {
struct iscsi_iser_cmd_task *ctask;
-   struct iscsi_iser_conn *conn;
+   struct iser_conn *ib_conn;
intnotify_enable;
 
/* vector of registered buffers */
diff --git a/drivers/infiniband/ulp/iser/iser_initiator.c b/drivers/infiniband/ulp/iser/iser_initiator.c
index ccf56f6..14ae61e 100644
--- a/drivers/infiniband/ulp/iser/iser_initiator.c
+++ b/drivers/infiniband/ulp/iser/iser_initiator.c
@@ -249,7 +249,7 @@ static int iser_post_receive_control(str
}
 
recv_dto = &rx_desc->dto;
-   recv_dto->conn  = iser_conn;
+   recv_dto->ib_conn = iser_conn->ib_conn;
recv_dto->regd_vector_len = 0;
 
regd_hdr = &rx_desc->hdr_regd_buf;
@@ -296,7 +296,7 @@ static void iser_create_send_desc(struct
regd_hdr->virt_addr  = tx_desc; /* == &tx_desc->iser_header */
regd_hdr->data_size  = ISER_TOTAL_HEADERS_LEN;
 
-   send_dto->conn  = iser_conn;
+   send_dto->ib_conn = iser_conn->ib_conn;
send_dto->notify_enable   = 1;
send_dto->regd_vector_len = 0;
 
@@ -588,7 +588,7 @@ void iser_rcv_completion(struct iser_des
 unsigned long dto_xfer_len)
 {
struct iser_dto*dto = &rx_desc->dto;
-   struct iscsi_iser_conn *conn = dto->conn;
+   struct iscsi_iser_conn *conn = dto->ib_conn->iser_conn;
struct iscsi_session *session = conn->iscsi_conn->session;
struct iscsi_cmd_task *ctask;
struct iscsi_iser_cmd_task *iser_ctask;
@@ -641,7 +641,8 @@ void iser_rcv_completion(struct iser_des
 void iser_snd_completion(struct iser_desc *tx_desc)
 {
struct iser_dto*dto = &tx_desc->dto;
-   struct iscsi_iser_conn *iser_conn = dto->conn;
+   struct iser_conn   *ib_conn = dto->ib_conn;
+   struct iscsi_iser_conn *iser_conn = ib_conn->iser_conn;
struct iscsi_conn  *conn = iser_conn->iscsi_conn;
struct iscsi_mgmt_task *mtask;
 
@@ -652,7 +653,7 @@ void iser_snd_completion(struct iser_des
if (tx_desc->type == ISCSI_TX_DATAOUT)
kmem_cache_free(ig.desc_cache, tx_desc);
 
-   atomic_dec(&iser_conn->ib_conn->post_send_buf_count);
+   atomic_dec(&ib_conn->post_send_buf_count);
 
write_lock(conn->recv_lock);
if (conn->suspend_tx) {
diff --git a/drivers/infiniband/ulp/iser/iser_verbs.c b/drivers/infiniband/ulp/iser/iser_verbs.c
index 72febf1..11d4e87 100644
--- a/drivers/infiniband/ulp/iser/iser_verbs.c
+++ b/drivers/infiniband/ulp/iser/iser_verbs.c
@@ -570,6 +570,8 @@ void iser_conn_release(struct iser_conn 
/* on EVENT_ADDR_ERROR there's no device yet for this conn */
if (device != NULL)
iser_device_try_release(device);
+   if (ib_conn->iser_conn)
+   ib_conn->iser_conn->ib_conn = NULL;
kfree(ib_conn);
 }
 
@@ -692,7 +694,7 @@ int iser_post_recv(struct iser_desc *rx_
struct iser_dto   *recv_dto = &rx_desc->dto;
 
/* Retrieve conn */
-   ib_conn = recv_dto->conn->ib_conn;
+   ib_conn = recv_dto->ib_conn;
 
iser_dto_to_iov(recv_dto, iov, 2);
 
@@ -725,7 +727,7 @@ int iser_post_send(struct iser_desc *tx_
struct iser_conn  *ib_conn;
struct iser_dto   *dto = &tx_desc->dto;
 
-   ib_con

[openib-general] [PATCH 3/3] IB/iser: fix the description of iSER in Kconfig

2006-09-27 Thread Erez Zilber

fix the description of iSER in Kconfig. It is not accurate.

Signed-off-by: Erez Zilber <[EMAIL PROTECTED]>

---

 drivers/infiniband/ulp/iser/Kconfig |   11 ++-
 1 files changed, 6 insertions(+), 5 deletions(-)

e6a8887cad4e2270c5173451e8b706b907b88133
diff --git a/drivers/infiniband/ulp/iser/Kconfig b/drivers/infiniband/ulp/iser/Kconfig
index fead87d..80f6716 100644
--- a/drivers/infiniband/ulp/iser/Kconfig
+++ b/drivers/infiniband/ulp/iser/Kconfig
@@ -1,11 +1,12 @@
 config INFINIBAND_ISER
-   tristate "ISCSI RDMA Protocol"
+   tristate "iSCSI Extensions for RDMA (iSER)"
depends on INFINIBAND && SCSI
select SCSI_ISCSI_ATTRS
---help---
- Support for the ISCSI RDMA Protocol over InfiniBand.  This
- allows you to access storage devices that speak ISER/ISCSI
+ Support for the iSCSI Extensions for RDMA (iSER) Protocol over InfiniBand. This
+ allows you to access storage devices that speak iSCSI over iSER
  over InfiniBand.
 
- The ISER protocol is defined by IETF.
- See .
+ The iSER protocol is defined by IETF.
+ See 
+ and 
-- 
1.2.6





[openib-general] [PATCH 2/3] IB/iser: dma unmap an unaligned for rdma data before touching it

2006-09-27 Thread Erez Zilber

iSER uses the dma mapping api to map the page holding the
scsi command data to the hca dma address space. When the
command data is not aligned for rdma, the data is copied
to/from an allocated buffer which in turn is used for
executing this command. The pages associated with the
command must be unmapped before being touched.

Signed-off-by: Erez Zilber <[EMAIL PROTECTED]>

---

 drivers/infiniband/ulp/iser/iscsi_iser.h |7 
 drivers/infiniband/ulp/iser/iser_initiator.c |   49 +-
 drivers/infiniband/ulp/iser/iser_memory.c|   42 ++
 3 files changed, 59 insertions(+), 39 deletions(-)

78a237418bd3547cfeb49828a8b857ac5241749f
diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.h b/drivers/infiniband/ulp/iser/iscsi_iser.h
index 7f44636..4a7069f 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.h
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.h
@@ -350,4 +350,11 @@ int  iser_post_send(struct iser_desc *tx
 
 int iser_conn_state_comp(struct iser_conn *ib_conn,
 enum iser_ib_conn_state comp);
+
+int iser_dma_map_task_data(struct iscsi_iser_cmd_task *iser_ctask,
+   struct iser_data_buf   *data,
+   enum   iser_data_dir   iser_dir,
+   enum   dma_data_direction  dma_dir);
+
+void iser_dma_unmap_task_data(struct iscsi_iser_cmd_task *iser_ctask);
 #endif
diff --git a/drivers/infiniband/ulp/iser/iser_initiator.c 
b/drivers/infiniband/ulp/iser/iser_initiator.c
index 14ae61e..9b3d79c 100644
--- a/drivers/infiniband/ulp/iser/iser_initiator.c
+++ b/drivers/infiniband/ulp/iser/iser_initiator.c
@@ -66,42 +66,6 @@ static void iser_dto_add_regd_buff(struc
dto->regd_vector_len++;
 }
 
-static int iser_dma_map_task_data(struct iscsi_iser_cmd_task *iser_ctask,
- struct iser_data_buf   *data,
- enum   iser_data_dir   iser_dir,
- enum   dma_data_direction  dma_dir)
-{
-   struct device *dma_device;
-
-   iser_ctask->dir[iser_dir] = 1;
-   dma_device = 
iser_ctask->iser_conn->ib_conn->device->ib_device->dma_device;
-
-   data->dma_nents = dma_map_sg(dma_device, data->buf, data->size, 
dma_dir);
-   if (data->dma_nents == 0) {
-   iser_err("dma_map_sg failed!!!\n");
-   return -EINVAL;
-   }
-   return 0;
-}
-
-static void iser_dma_unmap_task_data(struct iscsi_iser_cmd_task *iser_ctask)
-{
-   struct device  *dma_device;
-   struct iser_data_buf *data;
-
-   dma_device = 
iser_ctask->iser_conn->ib_conn->device->ib_device->dma_device;
-
-   if (iser_ctask->dir[ISER_DIR_IN]) {
-   data = &iser_ctask->data[ISER_DIR_IN];
-   dma_unmap_sg(dma_device, data->buf, data->size, 
DMA_FROM_DEVICE);
-   }
-
-   if (iser_ctask->dir[ISER_DIR_OUT]) {
-   data = &iser_ctask->data[ISER_DIR_OUT];
-   dma_unmap_sg(dma_device, data->buf, data->size, DMA_TO_DEVICE);
-   }
-}
-
 /* Register user buffer memory and initialize passive rdma
  *  dto descriptor. Total data size is stored in
  *  iser_ctask->data[ISER_DIR_IN].data_len
@@ -699,14 +663,19 @@ void iser_ctask_rdma_init(struct iscsi_i
 void iser_ctask_rdma_finalize(struct iscsi_iser_cmd_task *iser_ctask)
 {
int deferred;
+   int is_rdma_aligned = 1;
 
/* if we were reading, copy back to unaligned sglist,
 * anyway dma_unmap and free the copy
 */
-   if (iser_ctask->data_copy[ISER_DIR_IN].copy_buf != NULL)
+   if (iser_ctask->data_copy[ISER_DIR_IN].copy_buf != NULL) {
+   is_rdma_aligned = 0;
iser_finalize_rdma_unaligned_sg(iser_ctask, ISER_DIR_IN);
-   if (iser_ctask->data_copy[ISER_DIR_OUT].copy_buf != NULL)
+   }
+   if (iser_ctask->data_copy[ISER_DIR_OUT].copy_buf != NULL) {
+   is_rdma_aligned = 0;
iser_finalize_rdma_unaligned_sg(iser_ctask, ISER_DIR_OUT);
+   }
 
if (iser_ctask->dir[ISER_DIR_IN]) {
deferred = iser_regd_buff_release
@@ -726,7 +695,9 @@ void iser_ctask_rdma_finalize(struct isc
}
}
 
-   iser_dma_unmap_task_data(iser_ctask);
+   /* if the data was unaligned, it was already unmapped and then copied */
+   if (is_rdma_aligned)
+   iser_dma_unmap_task_data(iser_ctask);
 }
 
 void iser_dto_buffs_release(struct iser_dto *dto)
diff --git a/drivers/infiniband/ulp/iser/iser_memory.c 
b/drivers/infiniband/ulp/iser/iser_memory.c
index 31950a5..0f87163 100644
--- a/drivers/infiniband/ulp/iser/iser_memory.c
+++ b/drivers/infiniband/ulp/iser/iser_memory.c
@@ -360,6 +360,44 @@ static void iser_page_vec_build(struct i
}
 }
 
+int iser_dma_map_task_data(struct iscsi_iser_cmd_task *iser_ctask,
+   struct iser_data_buf   *data,
+   enum   iser_data_dir

[openib-general] [PATCH 1/3] IB/iser: have iSER data transaction object pointing to iSER conn

2006-09-27 Thread Erez Zilber

iSER uses a data transaction object (struct iser_dto) as part
of its IB data descriptors (struct iser_desc) management.
It also uses a hierarchy of connection structures pointing to
each other. A DTO may exist even after the iscsi_iser connection
it points to is destroyed (e.g. one that is bound to a posted
receive buffer which was flushed by the IB HW). Hence DTOs need
to point to the lowest connection, which is struct iser_conn.
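
The lifetime argument above can be pictured with a small userspace model. This is a sketch under stated assumptions: `high_conn`, `low_conn` and `toy_dto` are hypothetical stand-ins for the upper (iscsi_iser) connection, the lowest (IB) connection and a DTO, and the functions are invented for the example. Each side clears the other's back-pointer before it is freed, so a descriptor that outlives the upper connection can still reach the lower one and can tell that the upper one is gone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the two connection levels and a DTO. */
struct low_conn;
struct high_conn { struct low_conn *low; };
struct low_conn  { struct high_conn *high; int post_send_count; };
struct toy_dto   { struct low_conn *low; };  /* points at the lowest level */

/* Tear down the upper connection, breaking the back-pointer first. */
static void high_conn_destroy(struct high_conn *h)
{
    if (h->low)
        h->low->high = NULL;
    free(h);
}

/* Release the lower connection, breaking the back-pointer first. */
static void low_conn_release(struct low_conn *l)
{
    if (l->high)
        l->high->low = NULL;
    free(l);
}

/* Completion handling that only needs the lower connection.
 * Returns 1 if the upper connection still exists, 0 if it is gone. */
static int toy_completion(struct toy_dto *dto)
{
    assert(dto->low != NULL);
    dto->low->post_send_count--;
    return dto->low->high != NULL;
}
```

The bookkeeping on the lower connection keeps working after the upper one is destroyed, which is the reason for anchoring the DTO at the lowest level.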

Signed-off-by: Erez Zilber <[EMAIL PROTECTED]>

---

 drivers/infiniband/ulp/iser/iscsi_iser.c |2 ++
 drivers/infiniband/ulp/iser/iscsi_iser.h |2 +-
 drivers/infiniband/ulp/iser/iser_initiator.c |   11 ++-
 drivers/infiniband/ulp/iser/iser_verbs.c |8 +---
 4 files changed, 14 insertions(+), 9 deletions(-)

57b132002a5e3bf3ba0ae362f174404e29c69449
diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c 
b/drivers/infiniband/ulp/iser/iscsi_iser.c
index 101e407..b37f429 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.c
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.c
@@ -317,6 +317,8 @@ iscsi_iser_conn_destroy(struct iscsi_cls
struct iscsi_iser_conn *iser_conn = conn->dd_data;
 
iscsi_conn_teardown(cls_conn);
+   if (iser_conn->ib_conn)
+   iser_conn->ib_conn->iser_conn = NULL;
kfree(iser_conn);
 }
 
diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.h 
b/drivers/infiniband/ulp/iser/iscsi_iser.h
index 7c3d0c9..7f44636 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.h
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.h
@@ -187,7 +187,7 @@ struct iser_regd_buf {
 
 struct iser_dto {
struct iscsi_iser_cmd_task *ctask;
-   struct iscsi_iser_conn *conn;
+   struct iser_conn *ib_conn;
intnotify_enable;
 
/* vector of registered buffers */
diff --git a/drivers/infiniband/ulp/iser/iser_initiator.c 
b/drivers/infiniband/ulp/iser/iser_initiator.c
index ccf56f6..14ae61e 100644
--- a/drivers/infiniband/ulp/iser/iser_initiator.c
+++ b/drivers/infiniband/ulp/iser/iser_initiator.c
@@ -249,7 +249,7 @@ static int iser_post_receive_control(str
}
 
recv_dto = &rx_desc->dto;
-   recv_dto->conn  = iser_conn;
+   recv_dto->ib_conn = iser_conn->ib_conn;
recv_dto->regd_vector_len = 0;
 
regd_hdr = &rx_desc->hdr_regd_buf;
@@ -296,7 +296,7 @@ static void iser_create_send_desc(struct
regd_hdr->virt_addr  = tx_desc; /* == &tx_desc->iser_header */
regd_hdr->data_size  = ISER_TOTAL_HEADERS_LEN;
 
-   send_dto->conn  = iser_conn;
+   send_dto->ib_conn = iser_conn->ib_conn;
send_dto->notify_enable   = 1;
send_dto->regd_vector_len = 0;
 
@@ -588,7 +588,7 @@ void iser_rcv_completion(struct iser_des
 unsigned long dto_xfer_len)
 {
struct iser_dto*dto = &rx_desc->dto;
-   struct iscsi_iser_conn *conn = dto->conn;
+   struct iscsi_iser_conn *conn = dto->ib_conn->iser_conn;
struct iscsi_session *session = conn->iscsi_conn->session;
struct iscsi_cmd_task *ctask;
struct iscsi_iser_cmd_task *iser_ctask;
@@ -641,7 +641,8 @@ void iser_rcv_completion(struct iser_des
 void iser_snd_completion(struct iser_desc *tx_desc)
 {
struct iser_dto*dto = &tx_desc->dto;
-   struct iscsi_iser_conn *iser_conn = dto->conn;
+   struct iser_conn   *ib_conn = dto->ib_conn;
+   struct iscsi_iser_conn *iser_conn = ib_conn->iser_conn;
struct iscsi_conn  *conn = iser_conn->iscsi_conn;
struct iscsi_mgmt_task *mtask;
 
@@ -652,7 +653,7 @@ void iser_snd_completion(struct iser_des
if (tx_desc->type == ISCSI_TX_DATAOUT)
kmem_cache_free(ig.desc_cache, tx_desc);
 
-   atomic_dec(&iser_conn->ib_conn->post_send_buf_count);
+   atomic_dec(&ib_conn->post_send_buf_count);
 
write_lock(conn->recv_lock);
if (conn->suspend_tx) {
diff --git a/drivers/infiniband/ulp/iser/iser_verbs.c 
b/drivers/infiniband/ulp/iser/iser_verbs.c
index 72febf1..11d4e87 100644
--- a/drivers/infiniband/ulp/iser/iser_verbs.c
+++ b/drivers/infiniband/ulp/iser/iser_verbs.c
@@ -570,6 +570,8 @@ void iser_conn_release(struct iser_conn 
/* on EVENT_ADDR_ERROR there's no device yet for this conn */
if (device != NULL)
iser_device_try_release(device);
+   if (ib_conn->iser_conn)
+   ib_conn->iser_conn->ib_conn = NULL;
kfree(ib_conn);
 }
 
@@ -692,7 +694,7 @@ int iser_post_recv(struct iser_desc *rx_
struct iser_dto   *recv_dto = &rx_desc->dto;
 
/* Retrieve conn */
-   ib_conn = recv_dto->conn->ib_conn;
+   ib_conn = recv_dto->ib_conn;
 
iser_dto_to_iov(recv_dto, iov, 2);
 
@@ -725,7 +727,7 @@ int iser_post_send(struct iser_desc *tx_
struct iser_conn  *ib_conn;
struct iser_dto   *dto = &tx_desc->dto;
 
-   ib_conn = dto->conn->ib_conn;
+   ib_conn = dto->ib_conn;
 
iser_dto_to_i

[openib-general] [PATCH 0/3] IB/iser: bug fixes for 2.6.19 rc1

2006-09-27 Thread Erez Zilber

Roland,

Here is a series of patches for iSER. Most of them are bug fixes. I hope 
that they can be added to rc1.

Thanks
Erez



Re: [openib-general] Port reuse issue for rdma_cm/iwarp

2006-09-27 Thread Dennis Dalessandro

This has to do with the socket going into the TIME_WAIT state, where it
waits for any packets that may still be in flight, as Caitlin said. From
what I was told, there is not really any option to get around this with
the Ammasso card. That was back when they were still in business, though,
and it applied to their ccil driver. You are probably better off using
different ports.
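
One way to "use different ports" is to let the stack pick the local port. The sketch below does this with plain TCP sockets on loopback; it is an assumption that something similar is acceptable for the application in question, and the thread does not establish how the rdma_cm/iWARP path behaves. `bind_ephemeral` is a hypothetical helper name; the socket calls themselves are standard POSIX.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Bind a TCP socket to an ephemeral local port on loopback.
 * Returns the socket fd (>= 0) and stores the chosen port in *port,
 * or returns -1 on failure. */
static int bind_ephemeral(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd;

    assert(port != NULL);
    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                 /* 0 = let the kernel choose a port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}
```

Each call gets its own local port, so a new connection does not need the four-tuple of one that is still in TIME_WAIT.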

-Dennis


On Tue, 2006-09-26 at 17:53 -0400, Sundeep Narravula wrote:
> > TCP restricts prompt re-use of the same Source/Destination
> > Address/Port pair while old traffic could still be in-flight.
> > This is generally not an issue because prompt re-use of the
> > exact four tuple is rare.
> >
> > Is there a special reason why your application needs to
> > reuse the same port from the active side? If the port number
> > is being used to identify the rank, could private data be
> > used instead?
> 
> Our application is primarily an invocation of multiple independent parallel
> jobs which all need to connect to each other on each invocation. Since
> this is a TCP limitation, is there any interface similar to setsockopt
> with TCP_NODELAY? We probably need to use different ports otherwise.
> 
> Thanks,
>   --Sundeep.
> 



Re: [openib-general] enable GSO over IPoIB

2006-09-27 Thread Shirley Ma


"Michael S. Tsirkin" <[EMAIL PROTECTED]> wrote on 09/27/2006 01:30:03 AM:
>Any idea what ethtool does that IPoIB can't support?
ethtool is an Ethernet device tool. It's OK to partially implement ethtool operations in IPoIB. We would also need to patch the user-level utility to support ibX interfaces; right now it only supports ethX.

thanks
Shirley Ma
IBM Linux Technology Center

Re: [openib-general] heads-up - ipoib NAPI

2006-09-27 Thread Michael S. Tsirkin

Quoting r. Shirley Ma <[EMAIL PROTECTED]>:
> Your NAPI poll is driven either by the receiver quota or by any send CQE in the CQ. Have 
> you tested UDP performance? Any difference?

The thing to do currently is probably to wait for Roland to post an
updated patch, then test it.

-- 
MST


Re: [openib-general] heads-up - ipoib NAPI

2006-09-27 Thread Shirley Ma

Hi, Eli,

Eli cohen <[EMAIL PROTECTED]> wrote on 09/26/2006 11:35:26 PM:
> On Tue, 2006-09-26 at 21:34 -0700, Shirley Ma wrote:
> 
> > NAPI patch moves ipoib poll from hardware interrupt context to softirq
> > context. It would reduce the hardware interrupts, reduce hardware
> > latency and induce some network latency. It might reduce cpu
> > utilization. But I still question the BW improvement. I did see
> > varying performance with the same test under the same conditions.
> > 
> When you open just one connection you can see around 10% variation
> in the BW measurement. But then you don't utilize all the CPU power you
> have and you don't get to the threshold where NAPI becomes effective.
> Using multiple connections utilizes all CPUs in the system, increases
> send rate, and increases the chances of the receiver to poll CQEs up to
> its quota and be scheduled again without re-enabling interrupts.

Send rate shouldn't be limited by one connection. The CPU is much faster than the link speed. I don't think the send rate with multiple connections is higher than with one connection. Do you have any data to show that?

When I monitored the CQEs, I didn't see many CQEs in the CQ per notification, and I don't think moving the poll from hardware interrupt context to softirq context (NAPI) would increase that number. Or the added latency might be what increases the number: I did see that number and the performance both increase with some udelay in hardware interrupt polling mode. If you saw the packet count increase, how many packets did you see in one hardware interrupt poll and in one NAPI poll?

Your NAPI poll is driven either by the receiver quota or by any send CQE in the CQ. Have you tested UDP performance? Any difference?
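
The quota-driven polling discussed in this thread can be sketched as a toy model. This is not the IPoIB NAPI patch and not the kernel NAPI API: `toy_cq` and `toy_poll` are hypothetical names, and the "interrupt" is just a flag. The sketch only shows the general rule that a poll which uses up its whole quota stays in polling mode, while a poll that drains the queue re-arms the interrupt.

```c
#include <assert.h>

/* Hypothetical toy completion queue: just a count of pending entries. */
struct toy_cq {
    int pending;      /* completions waiting to be processed */
    int irq_enabled;  /* nonzero if the "interrupt" is armed */
};

/* Process up to `budget` completions and return how many were handled.
 * If fewer than `budget` were found, the queue is drained and the
 * interrupt is re-armed; otherwise it stays disabled and the caller is
 * expected to poll again. */
static int toy_poll(struct toy_cq *cq, int budget)
{
    int done = 0;

    assert(budget > 0);
    while (done < budget && cq->pending > 0) {
        cq->pending--;
        done++;
    }
    if (done < budget)
        cq->irq_enabled = 1;  /* drained: go back to interrupt mode */
    else
        cq->irq_enabled = 0;  /* quota used up: stay in polling mode */
    return done;
}
```

With 10 pending entries and a quota of 4, the first two polls use the full quota and leave the interrupt off; the third poll handles the remaining 2 and re-arms it. How often a real receiver reaches its quota is exactly the open question in the exchange above.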

Thanks
Shirley Ma

Re: [openib-general] enable GSO over IPoIB

2006-09-27 Thread Michael S. Tsirkin

Quoting r. Shirley Ma <[EMAIL PROTECTED]>:
> Subject: Re: enable GSO over IPoIB
> 
> > Shirley> Since linux 2.6.18 supports GSO, I have patched IPoIB to
> > Shirley> enable GSO, but haven't tested the performance yet. Has
> > Shirley> anyone tried already?
> >
> > No, I don't think anyone looked at that yet.  Could you post your
> > patch?  What is required?  Supporting gather/scatter?
> >
> >  - R.
> 
> Don't need to. GSO only improves sender-side performance. It allows large 
> packet sends in ULPs, and segments the packet in the interface layer before 
> driver xmit. GSO is enabled through ethtool. Since ipoib doesn't 
> support ethtool, I simply added a module parameter to set the interface 
> GSO flag when loading the module.

Any idea what ethtool does that IPoIB can't support?
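
The segmentation step described in the quoted text (one large send split into smaller pieces just before the driver transmits) can be illustrated with a minimal sketch. It only computes segment lengths; `toy_segment` is a hypothetical function, not the kernel GSO code, and the segment size used in the usage note is an arbitrary illustrative value.

```c
#include <assert.h>
#include <stddef.h>

/* Split a payload of `len` bytes into segments of at most `mss` bytes.
 * Stores each segment's length in seg_len[] (up to max_segs entries) and
 * returns the number of segments, or -1 if they do not fit. */
static int toy_segment(size_t len, size_t mss, size_t *seg_len, int max_segs)
{
    int n = 0;

    assert(mss > 0);
    while (len > 0) {
        size_t chunk = len < mss ? len : mss;

        if (n == max_segs)
            return -1;
        seg_len[n++] = chunk;
        len -= chunk;
    }
    return n;
}
```

For example, a 5000-byte payload with a 2044-byte segment size yields three segments of 2044, 2044 and 912 bytes. The upper layers hand down one large buffer and the split happens once, close to the driver.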

-- 
MST


[openib-general] oops after rmmod ib_cm when stopping iSER

2006-09-27 Thread Erez Zilber

Sean,

When stopping iSER, we run 'modprobe -r ib_iser'. Then, we see an oops 
(below). In order to check which module caused that oops, I replaced the 
'modprobe -r' call with rmmod for each module:

rmmod ib_iser
rmmod libiscsi
rmmod scsi_transport_iscsi
rmmod rdma_cm
rmmod ib_addr
rmmod ib_cm

If I wait a few seconds before the removal of ib_cm, everything is ok.

thyme login: Sep 27 09:50:08 thyme kernel: iser: iscsi_iser_ep_disconnect:ib conn 81005e426000 state 2
Sep 27 09:50:08 thyme kernel: iser: iser_cq_tasklet_fn:comp w. error op 0 status 5
Sep 27 09:50:08 thyme last message repeated 3 times
Sep 27 09:50:08 thyme kernel: iser: iser_cma_handler:event 10 conn 81005e426000 id 81006c304a00
Sep 27 09:50:08 thyme kernel: iser: iser_free_ib_conn_res:freeing conn 81005e426000 cma_id 81006c304a00 fmr pool 8100560f2e40 qp f0
Sep 27 09:50:08 thyme kernel: iser: iser_device_try_release:device 8100796037c0 refcount 0
Sep 27 09:50:09 thyme kernel: cma_cleanup: entry
Sep 27 09:50:09 thyme kernel: cma_cleanup: calling destroy_workqueue
Sep 27 09:50:09 thyme kernel: cma_cleanup: calling idr_destroy(&sdp_ps)
Sep 27 09:50:09 thyme kernel: cma_cleanup: calling idr_destroy(&tcp_ps)
Sep 27 09:50:09 thyme kernel: cma_cleanup: exit
Sep 27 09:50:09 thyme kernel: ib_cm_cleanup: entry
Sep 27 09:50:09 thyme kernel: ib_cm_cleanup: calling ib_unregister_client
Sep 27 09:50:09 thyme kernel: ib_cm_cleanup: calling idr_destroy
Sep 27 09:50:09 thyme kernel: ib_cm_cleanup: exit
Unable to handle kernel paging request at 8b02e017 RIP:
[] delayed_work_timer_fn+0x2c/0x40
PGD 203027 PUD 205027 PMD 0
Oops:  [1] SMP
CPU 3
Modules linked in: ib_uverbs ib_ipoib ib_sa autofs usbserial parport_pc 
lp parport edd cpufreq_userspace acpi_cpufreq thermal processor fan bud
Pid: 0, comm: swapper Not tainted 2.6.18-rc4-ga2d9f966-dirty #1
RIP: 0010:[] [] 
delayed_work_timer_fn+0x2c/0x40
RSP: 0018:81007e36fef8 EFLAGS: 00010246
RAX: 8b02dfff RBX: 0100 RCX: 81006b152d20
RDX: 0003 RSI: 810068576a00 RDI: 810068576a00
RBP: 81007e34 R08: efe9331445cb91ec R09: 81007e3a8008
R10:  R11: 0246 R12: 80241310
R13: 81007e36ff00 R14: 000a R15: 0003
FS: () GS:81007e344b40() knlGS:
CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: 8b02e017 CR3: 60a44000 CR4: 06e0
Process swapper (pid: 0, threadinfo 81007e36a000, task 81007e347080)
Stack: 80239826 81007e36ff00 81007e36ff00 81000102ac20
 0096 0011 8065a110
806b2b20 0003 80235d0b 81007e36ff48
Call Trace:
 [] run_timer_softirq+0x156/0x1e0
[] __do_softirq+0x6b/0xe0
[] call_softirq+0x1c/0x34
[] do_softirq+0x2c/0x90
[] mwait_idle+0x0/0x50
[] apic_timer_interrupt+0x66/0x6c
 [] mwait_idle+0x36/0x50
[] cpu_idle+0x6a/0x90
[] start_secondary+0x499/0x4b0


Code: 48 8b 3c d0 e9 4b ff ff ff 66 66 66 90 66 66 66 90 66 66 90
RIP [] delayed_work_timer_fn+0x2c/0x40
RSP 
CR2: 8b02e017
<0>Kernel panic - not syncing: Aiee, killing interrupt handler!

-- 



Erez Zilber | 972-9-971-7689
Software Engineer, Storage Team
Voltaire – The Grid Backbone
www.voltaire.com





Re: [openib-general] [openfabrics-ewg] OFED Status

2006-09-27 Thread Scott Weitzenkamp (sweitzen)

Yes, this is fine with me.

Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Aviram Gutman
> Sent: Tuesday, September 26, 2006 9:01 AM
> To: EWG; Openib-General@Openib.Org
> Subject: [openfabrics-ewg] OFED Status
> 
> Hi,
> 
> OFED 1.1 RC6 was released on Thu.
> 
> The issues that were resolved since are:
> 
> 1) OpenIB Diags build on SLES10 ppc - Solved by Moshe Katzir from Voltaire
> 2) iSER build on SLES10 needs root privilege - Voltaire fixed it
> 3) Bug #233 SDP crash on ipath - I believe MST fixed it. Betsy, please confirm.
> 4) Fix IBDM to allow multiple devices on the same machine - Eitan Zahavi fixed it
> 5) SRP HA - Fixed by Ishai
> 6) IPoIB HA on RH - Vlad made progress; the issue is still not solved.
> 7) The CM fix that Arlin asked for - In
> 
> Pending a fix for IPoIB HA, we would like to issue RC7, which is supposed to 
> be final. Is everyone OK with this approach?
> 
> 
> Aviram
> 
> ___
> openfabrics-ewg mailing list
> [EMAIL PROTECTED]
> http://openib.org/mailman/listinfo/openfabrics-ewg
> 


Re: [openib-general] [PATCH] IB/SRP: Enable multichannel

2006-09-27 Thread Michael S. Tsirkin

Quoting r. Vu Pham <[EMAIL PROTECTED]>:
> Either you can use multiple channels or derive different 
> initiator_port_ID in the login req to have multiple paths on 
> the same physical port

So how about we just stick a pointer inside the identifier extension
instead of enabling multichannel?

-- 
MST

