[openib-general] ipoib, ipv6 and multicast groups
recently our sm started throwing the following errors:

    Jan 29 18:10:49 706710 [42003940] -> __get_new_mlid: ERR 1B23: All available:32 mlids are taken
    Jan 29 18:10:49 706721 [42003940] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed
    Jan 29 18:10:51 345113 [42804940] -> __get_new_mlid: ERR 1B23: All available:32 mlids are taken
    Jan 29 18:10:51 345132 [42804940] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed
    Jan 29 18:10:51 514312 [41802940] -> __get_new_mlid: ERR 1B23: All available:32 mlids are taken
    Jan 29 18:10:51 514320 [41802940] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed
    Jan 29 18:10:51 735732 [42804940] -> __get_new_mlid: ERR 1B23: All available:32 mlids are taken

we tracked this down to a problem with ipoib's interaction with ipv6. ipv6 joins two multicast groups, instead of just one like ipv4.

    # netstat -A inet6 -g -n
    ...
    IPv6/IPv4 Group Memberships
    Interface       RefCnt Group
    --------------- ------ ---------------------
    lo              1      ff02::1
    ib0             1      ff02::1:ff00:77a2
    ib0             1      ff02::1

    # netstat -A inet6 -g -n
    ...
    IPv6/IPv4 Group Memberships
    Interface       RefCnt Group
    --------------- ------ ---------------------
    lo              1      224.0.0.1
    ib0             1      224.0.0.1

    # cat /sys/kernel/debug/ipoib/ib0_mcg
    GID: ff12:401b::0:0:0:0:1       created: 4298482097 queuelen: 0 complete: yes send_only: no
    GID: ff12:401b::0:0:0::         created: 4298482097 queuelen: 0 complete: yes send_only: no
    GID: ff12:601b::0:0:0:0:1       created: 4298482097 queuelen: 0 complete: yes send_only: no
    GID: ff12:601b::0:0:1:ff00:77a2 created: 4298482097 queuelen: 0 complete: yes send_only: no

the ff02::1:ff00:77a2 group is specific to the interface (link-local solicited-node), so each of our ib hosts running ipv6 registers its own unique multicast group. since our network is bigger than 32 hosts, it appears that we have exceeded the multicast tables in our local switches, and this is making opensm generate the above error. besides not running ipv6, are there any thoughts about this?
___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] [PATCHv5] IPoIB CM Experimental support
In message <[EMAIL PROTECTED]>, "Steve Wise" writes:
> What's the easy way to remove trailing spaces?  I seem to fat-finger
> them into my patches too.

using vi:

    :%s/  *$//g
       ^^ -- this is two spaces

"  *$" means at least one space at the end of the line.
Re: [openib-general] Race in mthca_cmd_post()
In message <[EMAIL PROTECTED]>, Roland Dreier writes:
> That says you should do a read to flush writes, doesn't it?  What am
> I missing.

i guess my point is that you don't need to read from the device; you could read from the bridge or a config register.

> The read that is failing is not going to DDR memory -- it's going to a
> "safe" register.

i believe by "safe" register they meant the pci config register space and not the memory-mapped registers on the card. looking at the trace from the analyzer, there are a couple of writes to config registers (config reg 1, PCI_COMMAND_IO) and then a read from the memory-mapped region. i would guess the read to the mmio region is flushing the writes to the config register, but the read happens "too soon" after those writes. on a more mundane computer, the write/write/read probably wouldn't be batched together.
Re: [openib-general] Race in mthca_cmd_post()
In message <[EMAIL PROTECTED]>, Roland Dreier writes:
> How do you force out writes without doing a read?  I don't know of any
> other way to flush writes that is guaranteed by the PCI spec.

see Documentation/io_ordering.txt.

> In any case that doesn't seem to be the problem here: the read is
> supposed to be done first, even in the source code.

i thought it might be, because in a later message john said:

> completing because the DDR memory is not yet available because SYS_EN never
> got down to the card before the readl, or did not complete before readl.

i would like to think that the altix isn't reordering reads and writes, and that perhaps there needs to be a short delay between certain writes.
Re: [openib-general] Race in mthca_cmd_post()
In message <[EMAIL PROTECTED]>, "Roland Dreier" writes:
> saying that a read of PCI MMIO space is racing with a write -- and I
> would have thought that a read has to flush all posted writes.

a read does flush all the posted writes, but that doesn't mean that the write operation has had enough time to "complete". i had a similar problem on the altix platform with posted writes. part of the hw init was to write the reset register, wait a few ticks, and then read the register until you saw a flag clear. reading the device "too soon" failed because it was in some poor state that didn't respond properly. with posted writes, you needed to force out the writes (not using a read of the device, obviously) and then wait the appropriate time.
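The sequence described above -- push the posted write out by some means other than a device read, then give the hardware time to settle before polling it -- might be sketched in kernel style like this (non-runnable fragment; RESET_REG, FLAG_REG, the bridge register, and the 10 ms settle time are all hypothetical):

```c
/* Hypothetical reset sequence following the pattern discussed above. */
writel(RESET_CMD, dev_regs + RESET_REG);   /* posted write */
(void)readl(bridge_regs + BRIDGE_STATUS);  /* flush via the bridge, not the device */
msleep(10);                                /* device-specific settle time */
while (!(readl(dev_regs + FLAG_REG) & RESET_DONE))
        cpu_relax();                       /* only now is the device safe to poll */
```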
Re: [openib-general] questions about gen2 srp driver
In message <[EMAIL PROTECTED]>, Roland Dreier writes:
> Yes, it's exactly because we know that work queues run in process
> context with interrupts enabled which lets us use spin_lock_irq.

thanks for the reply. you are quite right; i don't know what i was thinking.

> There's no limitation on number of outstanding RDMAs targeting a
> single R_Key.

after looking at it further i finally see what the srp driver is doing. i didn't know that the rkey/lkey it gets during init applies to the entire host memory. now things make a little more sense. part of my confusion is that ->va really seems to mean a physical address, not a virtual address.
[openib-general] questions about gen2 srp driver
i have been looking at the srp driver in the gen2 trunk (and the version that is in the latest 2.6.15 kernels). i have a couple of questions about its behavior and i am hoping someone can answer them.

it seems to take scsi_host->host_lock with a spin_lock_irq() inside a couple of work queues. i believe work queues run in process context and not interrupt context; therefore, one should probably use spin_lock_irqsave()?

secondly, there seems to be only one pair of lkeys/rkeys for a given srp "virtual" host. in srp_map_data() i see the rkey is assigned to the buffer:

    buf->key = cpu_to_be32(target->srp_host->mr->rkey);

but the virtual host adapter template says:

    .can_queue   = SRP_SQ_SIZE,
    .cmd_per_lun = SRP_SQ_SIZE,

if there is only a single set of rdma keys, how can the driver support more than one outstanding command (particularly on a target with multiple lun's)? i didn't think srp_post_send() was synchronous with respect to the completion of the current rdma request.