[openib-general] ipoib, ipv6 and multicast groups

2007-01-29 Thread chas williams - CONTRACTOR
recently our sm started throwing the following errors:

Jan 29 18:10:49 706710 [42003940] -> __get_new_mlid: ERR 1B23: All available:32 mlids are taken
Jan 29 18:10:49 706721 [42003940] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed
Jan 29 18:10:51 345113 [42804940] -> __get_new_mlid: ERR 1B23: All available:32 mlids are taken
Jan 29 18:10:51 345132 [42804940] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed
Jan 29 18:10:51 514312 [41802940] -> __get_new_mlid: ERR 1B23: All available:32 mlids are taken
Jan 29 18:10:51 514320 [41802940] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed
Jan 29 18:10:51 735732 [42804940] -> __get_new_mlid: ERR 1B23: All available:32 mlids are taken

we tracked this down to a problem with ipoib interaction
with ipv6.  ipv6 joins two multicast groups, instead of 
just one like ipv4.

# netstat -A inet6 -g  -n
...
IPv6/IPv4 Group Memberships
Interface       RefCnt Group
--------------- ------ ---------------------
lo              1      ff02::1
ib0             1      ff02::1:ff00:77a2
ib0             1      ff02::1


# netstat -A inet -g  -n
...
IPv6/IPv4 Group Memberships
Interface       RefCnt Group
--------------- ------ ---------------------
lo              1      224.0.0.1
ib0             1      224.0.0.1


# cat /sys/kernel/debug/ipoib/ib0_mcg
GID: ff12:401b::0:0:0:0:1
  created: 4298482097
  queuelen: 0
  complete:   yes
  send_only:   no

GID: ff12:401b::0:0:0::
  created: 4298482097
  queuelen: 0
  complete:   yes
  send_only:   no

GID: ff12:601b::0:0:0:0:1
  created: 4298482097
  queuelen: 0
  complete:   yes
  send_only:   no

GID: ff12:601b::0:0:1:ff00:77a2
  created: 4298482097
  queuelen: 0
  complete:   yes
  send_only:   no


the ff02::1:ff00:77a2 group is specific to the interface: it is the
link-local solicited-node address, built from the low 24 bits of the
interface's ipv6 address.  so each of our ib hosts running ipv6
registers its own unique multicast group.  since our network is bigger
than 32 hosts, it appears that we have exceeded the multicast tables in
our local switches and this is making opensm generate the above error.
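
to illustrate why the group count grows with the host count, here is a
minimal python sketch of how a solicited-node group is derived (the
fixed prefix ff02::1:ff00:0/104 plus the low 24 bits of the unicast
address).  the host addresses below are made up for the example; only
the one ending in 77a2 corresponds to the group shown above.

```python
import ipaddress

# Fixed solicited-node prefix ff02::1:ff00:0/104 as an integer.
SOLICITED_NODE_PREFIX = int(ipaddress.IPv6Address("ff02::1:ff00:0"))


def solicited_node_group(unicast: str) -> str:
    """Return the solicited-node multicast group for a unicast IPv6 address."""
    low24 = int(ipaddress.IPv6Address(unicast)) & 0xFFFFFF
    return str(ipaddress.IPv6Address(SOLICITED_NODE_PREFIX | low24))


# Hypothetical link-local addresses, one per host (assumed for illustration).
hosts = [
    "fe80::202:c902:0:77a2",
    "fe80::202:c902:0:77a6",
    "fe80::202:c902:0:77aa",
]

# Every host shares the all-nodes group but adds its own solicited-node group.
groups = {solicited_node_group(h) for h in hosts}
groups.add("ff02::1")

for g in sorted(groups):
    print(g)
```

with n hosts whose addresses differ in the low 24 bits, this gives n
solicited-node groups plus the shared all-nodes group, so the ipv6
group count scales with the number of hosts instead of staying fixed.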

besides not running ipv6, are there any thoughts about this?

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] [PATCHv5] IPoIB CM Experimental support

2007-01-11 Thread chas williams - CONTRACTOR
In message <[EMAIL PROTECTED]>,"Steve Wise" writes:
>What's the easy way to remove trailing spaces?  I seem to fat-finger
>them into my patches too. 

using vi, :%s/  *$//g
              ^^ -- this is two spaces
"  *$" (space, space, star) matches at least one space at the end of
the line.
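
outside of vi, the same cleanup can be scripted.  this is only a sketch,
and it goes a little further than the vi command by also removing
trailing tabs; the function name is made up for the example.

```python
import re


def strip_trailing_whitespace(text: str) -> str:
    """Remove spaces and tabs at the end of every line, keeping the newlines."""
    # With re.MULTILINE, "$" matches just before each newline and at the
    # end of the string, so every line is handled in one pass.
    return re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)


sample = "int x;  \n\treturn 0;\t \n"
print(repr(strip_trailing_whitespace(sample)))
```

leading indentation is left alone because the pattern is anchored to
the end of the line.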




Re: [openib-general] Race in mthca_cmd_post()

2006-10-14 Thread chas williams - CONTRACTOR
In message <[EMAIL PROTECTED]>,Roland Dreier writes:
>That says you should do a read to flush writes, doesn't it??  What am
>I missing.

i guess my point is that you don't need to read from the device, you
could read from the bridge or a config register.

>The read that is failing is not going to DDR memory -- it going to a
>"safe" register.

i believe by safe register they meant the pci config register space
and not the memory mapped registers on the card.  looking at the trace
from the analyzer, there are a couple of writes to a config register
(config reg 1, PCI_COMMAND_IO) and then a read from the memory mapped
region.

i would guess the read to the mmio region is flushing the writes to
the config register but the read happens "too soon" after those writes.
on a more mundane computer, the write/write/read probably wouldn't be
batched together.




Re: [openib-general] Race in mthca_cmd_post()

2006-10-14 Thread chas williams - CONTRACTOR
In message <[EMAIL PROTECTED]>,Roland Dreier writes:
>How do you force out writes without doing a read?  I don't know of any
>other way to flush writes that is guaranteed by the PCI spec.

see Documentation/io_ordering.txt. 

>In any case that doesn't seem to be the problem here: the read is
>supposed to be done first, even in the source code.

i thought it might be because in a later message john said, 

>completing because the DDR memory is not yet available because SYS_EN never
>got down to the card before the readl, or did not complete before readl.

i would like to think that the altix isn't reordering reads and writes
and that perhaps there needs to be a short delay between certain
writes.




Re: [openib-general] Race in mthca_cmd_post()

2006-10-14 Thread chas williams - CONTRACTOR
In message <[EMAIL PROTECTED]>,"Roland Dreier" writes:
>saying that a read of PCI MMIO space is racing with a write -- and I
>would have thought that a read has to flush all posted writes.

a read does flush all the posted writes but that doesn't mean that
the write operation has had enough time to "complete".

i had a similar problem on the altix platform with posted writes.
part of the hw init was to write the reset register, wait a few ticks,
and then read the register until you saw a flag clear.  reading the
device "too soon" failed because it was in some poor state that didn't
respond properly.  with posted writes, you needed to force out the writes
(not using read obviously) and then wait the appropriate time.




Re: [openib-general] questions about gen2 srp driver

2006-02-10 Thread chas williams - CONTRACTOR
In message <[EMAIL PROTECTED]>,Roland Dreier writes:
>Yes, it's exactly because we know that work queues run in process
>context with interrupts enabled which lets us use spin_lock_irq.

thanks for the reply.  you are quite right.  i don't know what i was
thinking.

>There's no limitation on number of outstanding RDMAs targeting a
>single R_Key.

after looking at it further i finally see what the srp driver is
doing.  i didn't know that the rkey/lkey it gets during init applies
to the entire host memory.  now, things make a little more sense.  part
of my confusion is that ->va really seems to mean physical address,
not virtual address.


[openib-general] questions about gen2 srp driver

2006-02-06 Thread chas williams - CONTRACTOR
i have been looking at the srp driver in the gen2 trunk (and the version
that is in the latest 2.6.15 kernels).  i have a couple questions about
its behavior and i am hoping someone can answer them.

it seems to take scsi_host->host_lock with a spin_lock_irq() inside
a couple of work queues.  i believe work queues run in process
context and not interrupt context.  therefore, one should probably
use spin_lock_irqsave()?

secondly, there seems to be only one pair of lkeys/rkeys for a
given srp "virtual" host.  in srp_map_data() i see the rkey is
assigned to the buffer:

buf->key = cpu_to_be32(target->srp_host->mr->rkey);

but the virtual host adapter template says:

.can_queue      = SRP_SQ_SIZE,
.cmd_per_lun    = SRP_SQ_SIZE,

if there is only a single set of rdma keys, how can the driver support
more than one outstanding command (particularly on a target with
multiple luns)?  i didn't think srp_post_send() was synchronous
with respect to the completion of the current rdma request.
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general