On Sat, Jun 6, 2009 at 7:27 PM, Chris Worley <[email protected]> wrote:
>
> On Sat, Jun 6, 2009 at 1:36 AM, Bart Van Assche
> <[email protected]> wrote:
> > On Sat, Jun 6, 2009 at 1:15 AM, Chris Worley<[email protected]> wrote:
> >> Setup: 1.4.1 w/ 3 dual-port QDR cards in each of two hosts, all ports
> >> direct connected, opensm running on all port GUIDs from one host, all
> >> links active.
> >>
> >> Problem: ibsrpdm only advertises the first port of the first HCA of the
> >> target.
> >> Next problem: I can add targets via
> >> /sys/class/infiniband_srp/srp-*/add_target on the initiator, but only
> >> when naming the two port guids of the first HCA on the target. In
> >> testing, both ports are used.
> >>
> >> Can somebody aim me in the right direction of what/who's stopping
> >> after the first HCA?
> >
> > Please have a look at the /sys/class/infiniband_srpt/srpt-*/login_info
> > information on the target. The following information should be
> > present:
> > * One /sys/class/infiniband_srpt/srpt-* entry per HCA.
> > * For each HCA, /sys/class/infiniband_srpt/srpt-${HCA}/login_info
> > should contain one line for each port of that HCA.
> >
> # cat /sys/class/infiniband_srpt/srpt-*/login_info
> tid_ext=0024710000000040,ioc_guid=0024710000000040,pkey=ffff,dgid=fe800000000000000024710000000041,service_id=0024710000000040
> tid_ext=0024710000000040,ioc_guid=0024710000000040,pkey=ffff,dgid=fe800000000000000024710000000042,service_id=0024710000000040
> tid_ext=0024710000000040,ioc_guid=0024710000000040,pkey=ffff,dgid=fe800000000000000024710000000045,service_id=0024710000000040
> tid_ext=0024710000000040,ioc_guid=0024710000000040,pkey=ffff,dgid=fe800000000000000024710000000046,service_id=0024710000000040
> tid_ext=0024710000000040,ioc_guid=0024710000000040,pkey=ffff,dgid=fe800000000000000002c903000292af,service_id=0024710000000040
> tid_ext=0024710000000040,ioc_guid=0024710000000040,pkey=ffff,dgid=fe800000000000000002c903000292b0,service_id=0024710000000040
>
> Each port has an entry, and the port GUIDs are correct (dgid's), but
> the rest of the GUIDs refer to the node GUID of the first IB HCA:
> 0024710000000040.
>
> Is that expected?

Yes. The ioc_guid in the above output is a GUID that identifies the SRP
target. A quote from the ib_srpt source code:

/*
 * We do not have a consistent service_id (ie. also id_ext of target_id)
 * to identify this target. We currently use the guid of the first HCA
 * in the system as service_id; therefore, the target_id will change
 * if this HCA is gone bad and replaced by different HCA.
 */

I'm not sure however what ibsrpdm should display -- I don't know
whether it should display one single set of login parameters or all
possible login parameters.

> > On the initiator you can use the information obtained from
> > "login_info" (after having replaced tid_ext by id_ext) to log in to
> > the target:
> > echo ... > /sys/class/infiniband_srp/srp-mlx4_0-1/add_target
>
> Using the first HCA's node GUIDs from my target adds on the initiator
> seems to work, but soon after (and not doing anything w/ the devices)
> the system panic'd (and remote power cycling is not working). It
> doesn't look like the panic was anywhere in IB or SRP modules:
> [ ... ]

That's bad news. Anyway, if the kernel on the initiator system
crashes, that's a bug in the kernel of the initiator system. I hope
that this can be resolved through a support contract. If not, I'm
afraid that you will have to experiment with kernel versions and OFED
versions in order to find a combination that works.

Bart.
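
For illustration only, here is a minimal Python sketch of the conversion
described in this thread: take a login_info line from the target, rename
tid_ext to id_ext, and produce the string that would be echoed into
add_target on the initiator. The function names are hypothetical and not
part of ib_srp or ib_srpt. The port_guid helper assumes that the low 64
bits of the dgid are the port GUID, which matches the output quoted above.
Nothing is written to sysfs here; the sketch only builds strings.

```python
def login_info_to_add_target(line: str) -> str:
    """Rewrite one login_info line into an add_target string.

    The only change is renaming the tid_ext key to id_ext, as described
    in the thread. Field order and values are kept as they are.
    """
    fields = []
    for item in line.strip().split(","):
        key, _, value = item.partition("=")
        if key == "tid_ext":
            key = "id_ext"
        fields.append(f"{key}={value}")
    return ",".join(fields)


def port_guid(line: str) -> str:
    """Return the last 16 hex digits of the dgid field (assumed port GUID)."""
    for item in line.strip().split(","):
        key, _, value = item.partition("=")
        if key == "dgid":
            return value[-16:]
    raise ValueError("no dgid field in login_info line")


if __name__ == "__main__":
    # Two of the login_info lines quoted in the thread.
    lines = [
        "tid_ext=0024710000000040,ioc_guid=0024710000000040,pkey=ffff,"
        "dgid=fe800000000000000024710000000041,service_id=0024710000000040",
        "tid_ext=0024710000000040,ioc_guid=0024710000000040,pkey=ffff,"
        "dgid=fe800000000000000002c903000292af,service_id=0024710000000040",
    ]
    for entry in lines:
        # Same ioc_guid for every port, but a distinct dgid per port.
        print(port_guid(entry), "->", login_info_to_add_target(entry))
```

Each resulting string would then be written to the add_target file of the
initiator port that should reach that target port, as in the echo command
quoted above.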
