Re: rdma_getaddrinfo and GUID
2015-07-23 22:46 GMT+03:00 Hefty, Sean sean.he...@intel.com:
> Ah, yes, the rdma_cm supports GIDs. It does not support GUIDs. Though it's usually trivial to convert a GUID into a GID.

By GUID, do you mean the node GUID? If so, using the node GUID could give something like multipath (provided the port GUIDs can be obtained from the node GUID).

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru
-- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
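The GUID-to-GID conversion Sean calls trivial can be sketched as follows. A GID is 128 bits: a 64-bit subnet prefix followed by the 64-bit port GUID. This sketch assumes the default link-local prefix fe80::/64 (a subnet manager may assign a different one), and the helper name guid_to_gid_str is made up for illustration.

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: write the text form of the GID for a 64-bit port GUID.
 * Assumption: the default link-local subnet prefix fe80::/64 is in use.
 * Returns buf on success, NULL if buf is too small. */
const char *guid_to_gid_str(uint64_t guid, char *buf, size_t len)
{
    unsigned char gid[16];
    int i;

    memset(gid, 0, sizeof(gid));
    gid[0] = 0xfe;                      /* upper 64 bits: fe80:0000:0000:0000 */
    gid[1] = 0x80;
    for (i = 0; i < 8; i++)             /* lower 64 bits: GUID, big-endian */
        gid[8 + i] = (unsigned char)(guid >> (56 - 8 * i));

    /* A GID has the same layout as an IPv6 address, so inet_ntop can print it. */
    return inet_ntop(AF_INET6, gid, buf, (socklen_t)len);
}
```

Called with the GUID 0x0002c90300ef6651 and a 64-byte buffer, this produces the compressed form of the fe80:... address used in the test program from the original question.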
rdma_getaddrinfo and GUID
Hello, is it possible to use rdma_getaddrinfo and specify a port GUID as the node? I tried fe80:0000:0000:0000:0002:c903:00ef:6651 with this simple test:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <rdma/rdma_cma.h>
#include <infiniband/ib.h>

int main(int argc, char **argv)
{
    struct rdma_addrinfo *hints, *info = NULL;
    int ret = -1;

    /* allocate and clear the whole struct, not just a pointer's worth */
    hints = malloc(sizeof(*hints));
    memset(hints, '\0', sizeof(*hints));
    hints->ai_flags = RAI_NUMERICHOST;
    hints->ai_family = AF_IB;
    // hints->ai_port_space = RDMA_PS_IB;

    ret = rdma_getaddrinfo(argv[1], NULL, hints, &info);
    if (ret < 0) {
        fprintf(stderr, "%d rdma_getaddrinfo\n", ret);
        return ret;
    }
    if (info == NULL) {
        fprintf(stderr, "%d rdma_getaddrinfo null\n", ret);
        return -1;
    }
    return 0;
}

But I get an INET6 address. Note also that I have not enabled IPoIB, so why is the address not in the AF_IB family?

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru
Re: rdma_getaddrinfo and GUID
2015-07-19 0:15 GMT+03:00 Vasiliy Tolstov v.tols...@selfip.ru:
> But I get an INET6 address. Note also that I have not enabled IPoIB, so why is the address not in the AF_IB family?

My mistake: I forgot to add RAI_FAMILY to ai_flags.

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru
Re: multipath rdma
2015-07-09 14:07 GMT+03:00 Elad Raz el...@mellanox.com:
> The Multipath RDMA solution is still in alpha stage and sadly it’s not ready for upstream. Be sure that i’ll update you once it will

Thanks for the answer. Is it possible to get access to it? =) I want to compare it to MPTCP.

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru
multipath rdma
Hello. I found the slides from a presentation (by Mellanox) about multipath RDMA. Is the source code available somewhere? When will Mellanox provide more info and docs? Thanks!

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru
Re: where defined IBV_RC
2015-03-13 18:26 GMT+03:00 Steve Wise sw...@opengridcomputing.com:
> Try IBV_QPT_RC from infiniband/verbs.h
> Also get the librdmacm source and there are examples of using this API. git://git.openfabrics.org/~shefty/librdmacm

Thanks. Is it possible to fix the readme for rdma_getaddrinfo?

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: where defined IBV_RC
2015-03-14 16:23 GMT+03:00 Vasiliy Tolstov v.tols...@selfip.ru:
> Try IBV_QPT_RC from infiniband/verbs.h
> Also get the librdmacm source and there are examples of using this API. git://git.openfabrics.org/~shefty/librdmacm

Also, does the ConnectX-3 support XRC? And how can this support be determined at runtime?

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: where defined IBV_RC
2015-03-14 16:30 GMT+03:00 Vasiliy Tolstov v.tols...@selfip.ru:
> Also, does the ConnectX-3 support XRC? And how can this support be determined at runtime?

Also, if I do not have any IB cards, I get unneeded errors like:

Fatal: unable to get RDMA device list

Why does the library print to stderr? The function already returns an error code.

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
detect rdma capable devices
I have software that can operate over RDMA and over Ethernet. What is the right way to detect RDMA devices? For example, if the system has an RDMA device, I try to use rdma_getaddrinfo; if there are no RDMA drivers or devices, I switch to getaddrinfo...

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: detect rdma capable devices
2015-03-13 10:16 GMT+03:00 Vasiliy Tolstov v.tols...@selfip.ru:
> I have software that can operate over RDMA and over Ethernet. What is the right way to detect RDMA devices? For example, if the system has an RDMA device, I try to use rdma_getaddrinfo; if there are no RDMA drivers or devices, I switch to getaddrinfo...

For now I call rdma_get_devices and check the result for NULL, but is that the right solution?

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
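One possible way to do this detection without calling into librdmacm is to look at sysfs, where the kernel lists registered RDMA devices under /sys/class/infiniband. The following is only a sketch: the function names are made up, and a listed device does not guarantee that any port is active, so a fallback to getaddrinfo may still be needed later.

```c
#include <dirent.h>
#include <string.h>

/* Count the entries of a sysfs class directory such as /sys/class/infiniband.
 * Returns the number of entries, or -1 if the directory cannot be opened
 * (for example because no RDMA drivers are loaded). */
int count_class_devices(const char *class_dir)
{
    DIR *dir = opendir(class_dir);
    struct dirent *ent;
    int count = 0;

    if (dir == NULL)
        return -1;
    while ((ent = readdir(dir)) != NULL) {
        /* skip the "." and ".." entries */
        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
            continue;
        count++;
    }
    closedir(dir);
    return count;
}

/* Nonzero if at least one RDMA device is registered with the kernel. */
int have_rdma_devices(void)
{
    return count_class_devices("/sys/class/infiniband") > 0;
}
```

An application could call have_rdma_devices() once at startup and choose between the rdma_getaddrinfo and getaddrinfo code paths based on the result.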
where defined IBV_RC
I'm trying to use rdma_getaddrinfo, but the compiler complains about IBV_RC:

error: ‘IBV_RC’ undeclared (first use in this function)
 rhints.ai_qp_type = IBV_RC;

Grepping the headers does not help.

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
2 port ib card rsocket and failover
Hello. Two years ago I used SCST with multipath to provide block devices. Now I need to drop SCST and move to software-defined storage. When using rsocket, how can I implement failover of one path? Also, if I resolve an address via rsocket and the server has two paths (two ports), which port is used for data?

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: does i need compat-rdma package
2014-12-24 12:03 GMT+03:00 Hiroyuki Sato hiroys...@gmail.com:
> Hello Vasilly Tolstov
> This URL might be help. http://www.rdmamojo.com
> There are RDMA configuration for many OSs. (RHEL, Ubuntu, SLES..)

Thanks, but I'm using a source-based distro that is not listed on this site =(

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: does i need compat-rdma package
2014-12-24 11:19 GMT+03:00 Or Gerlitz ogerl...@mellanox.com:
> You can use the inbox libraries and install them through your distro package installer, with RHEL, Fedora and such it would be just
> $ yum groupinstall "infiniband support"
> to get you the set of required RPMS [1]
> SLES and Ubuntu should have similar means, no need to use OFED, compat, etc.
> Or.
> [1] you mentioned ConnectX-3, so your basic needs are libmlx4, libibverbs and librdmacm

Thanks. Last question: I'm trying to use rsocket and AF_IB in my app. What is the minimal kernel version that supports this?

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: does i need compat-rdma package
2014-12-24 14:38 GMT+03:00 Or Gerlitz ogerl...@mellanox.com:
> AF_IB was merged in 3.11, BTW - can you share what's your motivation to use it? as for rsockets, basically any kernel that supports the rdma_cm along with it's user-space device, this goes a while back... you mentioned 3.14, should be OK

I'm trying to modify sheepdog and QEMU so that a GUID can be given as the node name and communication with the remote node is established without IPoIB. My knowledge of RDMA is not very good, though, so I am reading the sources and writing a simple app =).

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
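For an application that wants to decide at runtime whether to attempt AF_IB at all, the 3.11 threshold mentioned above can be checked against the running kernel. This is a rough sketch with made-up function names. A version comparison is only a heuristic, since distribution kernels may backport or leave out features, so the AF_IB call itself can still fail and should be handled.

```c
#include <stdio.h>
#include <sys/utsname.h>

/* Returns 1 if a release string such as "3.14.0" is at least major.minor,
 * 0 if it is older, and -1 if the string cannot be parsed. */
int release_at_least(const char *release, int major, int minor)
{
    int maj, min;

    if (sscanf(release, "%d.%d", &maj, &min) != 2)
        return -1;
    return maj > major || (maj == major && min >= minor);
}

/* Compares the running kernel against 3.11, the release in which AF_IB
 * was merged according to the reply above. Returns 1, 0, or -1 on error. */
int running_kernel_has_af_ib(void)
{
    struct utsname u;

    if (uname(&u) != 0)
        return -1;
    return release_at_least(u.release, 3, 11);
}
```

With the kernels mentioned on this list, "3.14.0" passes the check and a Debian-style "3.10-3-amd64" does not.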
Re: does i need compat-rdma package
2014-12-24 15:29 GMT+03:00 Bart Van Assche bvanass...@acm.org:
> Hello Vasiliy,
> A possible alternative is to start from a recent upstream kernel and to build and install the latest RDMA libraries from the upstream git repositories. I have attached the script I use myself since considerable time to install the latest RDMA libraries on openSUSE systems.

Nice! Thanks. I'll recheck my build (I have already built everything needed).

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
does i need compat-rdma package
Hello. I want to develop some software on Linux 3.14 using the InfiniBand libraries. I downloaded the latest OFED (3.12?) and see that I don't need some of its packages (I use only a Mellanox ConnectX-3). But I can't work out whether I need compat-rdma with Linux 3.14.

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: Tutorials/howto to modify exiting tcp/ip app for rdma
2014-11-28 19:08 GMT+03:00 Hal Rosenstock h...@dev.mellanox.co.il:
> Also, you need librdmacm 1.0.19-1 or later for this. rstream is the best example of how to use AF_IB.

Thanks! As I understand it, I need a kernel such as 3.14 =), librdmacm 1.0.19 or later, and rsockets to replace TCP/IP. What about address resolution? Sorry, I'm trying to put all the pieces together =). For example, I see that rdma_getaddrinfo can resolve an address, but according to http://manpages.ubuntu.com/manpages/raring/man3/rdma_resolve_addr.3.html I need IPoIB. For a pure RDMA network without IP, do I need to look at the opensm sources? Is there a minimal example somewhere of how to work with port GUIDs without using IPoIB?

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: Tutorials/howto to modify exiting tcp/ip app for rdma
2014-11-23 23:41 GMT+03:00 Vasiliy Tolstov v.tols...@selfip.ru:
> Hello. I want to study ibverbs and RDMA. I have an application that uses TCP/IP to transfer data. I want to add RDMA support while avoiding IPoIB, so I want to write an app that can find and connect to other nodes via GUIDs. Could somebody share some links, or tell me where I can find them? Thanks

Can anybody help?

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: Tutorials/howto to modify exiting tcp/ip app for rdma
2014-11-27 16:39 GMT+03:00 Sagi Grimberg sa...@dev.mellanox.co.il:
> You can keep doing IP address resolution using rdma_cm. You can see perftest examples in git://flatbed.openfabrics.org/~grockah/perftest.git
> Or, librdmacm's rping in git://flatbed.openfabrics.org/~shefty/librdmacm.git

Thanks! But if I don't have an IPoIB connection, why do I need an IP address? =)

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: Tutorials/howto to modify exiting tcp/ip app for rdma
2014-11-27 15:28 GMT+03:00 Rupert Dance rsda...@soft-forge.com:
> Vasiliy,
> Please review the OpenFabrics Alliance site and consider posting this question to the OFA User group.
> OFA: https://www.openfabrics.org/index.php
> User Group: http://lists.openfabrics.org/mailman/listinfo/users
> OFS User Portal: https://www.openfabrics.org/index.php/ofs-users.html
> Thanks

Thanks, I'll try.

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Tutorials/howto to modify exiting tcp/ip app for rdma
Hello. I want to study ibverbs and RDMA. I have an application that uses TCP/IP to transfer data. I want to add RDMA support while avoiding IPoIB, so I want to write an app that can find and connect to other nodes via GUIDs. Could somebody share some links, or tell me where I can find them? Thanks

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
debug sometimes slow performance
Hi all. I'm very happy using an SRPT disk via ib_srp (Linux 3.10). The storage is used by virtual machines (Xen).

Target side (2 servers): software RAID 10 with LVM on top of it.

scst.conf:

HANDLER vdisk_fileio {
        DEVICE sas00_md127 {
                filename /dev/md127
                nv_cache 1
        }
}

TARGET_DRIVER ib_srpt {
        TARGET ib_srpt_target_0 {
                enabled 1
                io_grouping_type this_group_only
                rel_tgt_id 1
                LUN 0 sas00_md127
        }
}

The LVM PV (/dev/md127) is exported to the initiator node (Xen). On the initiator node, the disks from both servers go into multipath. On this PV I have many LVs, one for each virtual machine. For each VPS I assemble a RAID 1 from 2 LVs (LVs from different target nodes).

Sometimes everything works fine, sometimes not. Which sysfs/procfs entries can I check on the initiator and target sides to find the bottleneck? I tried running blktrace on the md device provided to the VPS, but did not get the information I need.

Linux 3.10
SCST v3.0.0-pre2 (svn 4973)
Debian 7

Can somebody help me?

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: Need help with strange mlx4_core error
2014-03-01 16:50 GMT+04:00 Bart Van Assche bvanass...@acm.org:
> I'm not sure but I think there is a fix in kernel 3.11 for this issue.

Thanks. As I understand it, 3.10 is an LTS kernel and should get these updates. I'll search the changelogs for 3.11.

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: Need help with strange mlx4_core error
2014-03-01 20:55 GMT+04:00 Vasiliy Tolstov v.tols...@selfip.ru:
> Thanks. As I understand it, 3.10 is an LTS kernel and should get these updates. I'll search the changelogs for 3.11.

Hmm. Maybe I'm missing something, but I can't find a fix for this error in any of the 3.11 changelogs.

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: Need help with strange mlx4_core error
2014-03-01 21:23 GMT+04:00 Bart Van Assche bvanass...@acm.org:
> I have seen this issue myself with the 3.0 mlx4_en driver but do not see it anymore with the latest version of this driver. I do not know which patch fixes this issue. But I have been wondering whether it could have been the following patch:
>
> commit 0cc5c8bf11852dec3225fda2f53a599243095d23
> net/mlx4_en: Fix a race between napi poll function and RX ring cleanup
> The RX rings were cleaned while there was still possible RX traffic completion handling. Change the sequence of events so that the port is closed and the QPs are being stopped before RX cleanup.

Thanks! I'll try it.

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: Need help with strange mlx4_core error
2014-02-04 18:19 GMT+04:00 Vasiliy Tolstov v.tols...@selfip.ru:
> 2014-02-04 Vasiliy Tolstov v.tols...@selfip.ru:
>> This is pretty much very old firmware, could you update it please with the latest GA from the mellanox site and see if things smile back? also send the lspci | grep -i Mellanox
> I have updated the firmware to 2.7.200 (the latest on the Supermicro FTP). Now I'm going to test it.

Sorry for bumping an old thread. I have this error again.

ibstat:

CA 'mlx4_0'
        CA type: MT26428
        Number of ports: 1
        Firmware version: 2.7.5558
        Hardware version: a0
        Node GUID: 0x0030489dc9d8
        System image GUID: 0x0030489dc9db
        Port 1:
                State: Active
                Physical state: LinkUp
                Rate: 40
                Base lid: 7
                LMC: 0
                SM lid: 6
                Capability mask: 0x02510868

dmesg:

[1067167.780266] mlx4_ib destroy_qp_common: modify QP 00110e to RESET failed.
[1068708.243893] mlx4_ib destroy_qp_common: modify QP 001118 to RESET failed.
[1174994.897394] mlx4_core :03:00.0: Internal error detected:
[1174994.897484] mlx4_core :03:00.0: buf[00]: 001805a5
[1174994.897516] mlx4_core :03:00.0: buf[01]:
[1174994.897544] mlx4_core :03:00.0: buf[02]: 200715b6
[1174994.897577] mlx4_core :03:00.0: buf[03]:
[1174994.897607] mlx4_core :03:00.0: buf[04]: 0018050c
[1174994.897637] mlx4_core :03:00.0: buf[05]: 0001
[1174994.897666] mlx4_core :03:00.0: buf[06]: 2fd8
[1174994.897697] mlx4_core :03:00.0: buf[07]: 0084
[1174994.897727] mlx4_core :03:00.0: buf[08]: db9f
[1174994.897756] mlx4_core :03:00.0: buf[09]: 4000
[1174994.897786] mlx4_core :03:00.0: buf[0a]:
[1174994.897815] mlx4_core :03:00.0: buf[0b]:
[1174994.897845] mlx4_core :03:00.0: buf[0c]:
[1174994.897874] mlx4_core :03:00.0: buf[0d]:
[1174994.897904] mlx4_core :03:00.0: buf[0e]:
[1174994.897934] mlx4_core :03:00.0: buf[0f]:
[1174994.897963] mlx4_en :03:00.0: Internal error detected, restarting device
[1175000.294830] mlx4_ib destroy_qp_common: modify QP 001eca to RESET failed.
[1175000.352029] bonding: bond1: releasing active interface ib0
[1175000.352072] bonding: bond1: Warning: clearing HW address of bond1 while it still has VLANs.
[1175000.352182] bonding: bond1: When re-adding slaves, make sure the bond's HW address matches its VLANs'.
[1175000.352517] bonding: bond1: destroying bond bond1.
[1175000.354034] bonding: bond1: released all slaves
[1175000.354694] ib0: ib_cq_destroy (recv) failed
[1175000.354721] ib0: ib_destroy_srq failed: -16
[1175000.355780] ib0: ib_dealloc_pd failed

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: Need help with strange mlx4_core error
2014-02-28 22:24 GMT+04:00 Vasiliy Tolstov v.tols...@selfip.ru:
> This is pretty much very old firmware, could you update it please with the latest GA from the mellanox site and see if things smile back? also send the lspci | grep -i Mellanox

And after looking through dmesg, I see that somewhat earlier there was:

[271701.934646] device ib0 left promiscuous mode
[292670.940271] [ cut here ]
[292670.940309] WARNING: at /tmp/buildd/linux-3.10.11/net/sched/sch_generic.c:255 dev_watchdog+0xe3/0x153()
[292670.940359] NETDEV WATCHDOG: ib0 (mlx4_core): transmit queue 0 timed out
[292670.940388] Modules linked in: tcp_diag inet_diag ip_set_hash_net xt_set ip_set nfnetlink xt_tcpudp xt_pkttype ip6table_mangle ip6table_raw iptable_mangle iptable_raw ip6table_filter ip6_tables iptable_filter ip_tables 8021q garp stp mrp llc bonding ib_ucm ib_uverbs ib_addr ib_umad ib_ipoib ib_srp scsi_transport_srp ib_cm scsi_tgt mlx4_ib ib_sa ib_mad ib_core mlx4_en ipmi_si ipmi_devintf ipmi_msghandler ipt_NETFLOW(O) x_tables dm_mod md_mod iTCO_wdt coretemp acpi_cpufreq i7core_edac snd_pcm mperf iTCO_vendor_support snd_page_alloc snd_timer edac_core snd soundcore psmouse kvm_intel i2c_i801 ioatdma pcspkr serio_raw kvm lpc_ich evdev joydev mfd_core crc32c_intel processor button thermal_sys squashfs loop aufs(C) hid_generic usbhid hid ata_generic ata_piix microcode uhci_hcd ehci_pci mpt2sas libata ehci_hcd raid_class scsi_transport_sas usbcore usb_common scsi_mod mlx4_core igb i2c_algo_bit i2c_core dca ptp pps_core [last unloaded: nf_conntrack]
[292670.940991] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G CIO 3.10-3-amd64 #1 Debian 3.10.11-3+0~20131204153435.21+wheezy~1.gbpa177bd
[292670.941045] Hardware name: Supermicro B8DT6/B8DT6, BIOS 080015 03/03/2010
[292670.941075] 8138d323 8103be31 8801bfc0dc00
[292670.941130] 8801bfc03e08 8801b8688000 0100
[292670.941185] 812e5e78 8801b8688348 8103bee1 8153628f
[292670.941240] Call Trace:
[292670.941262] IRQ [8138d323] ? dump_stack+0xd/0x17
[292670.941303] [8103be31] ? warn_slowpath_common+0x5f/0x77
[292670.941334] [812e5e78] ? netif_tx_lock+0x7a/0x7a
[292670.941364] [8103bee1] ? warn_slowpath_fmt+0x45/0x4a
[292670.941394] [812e5e65] ? netif_tx_lock+0x67/0x7a
[292670.941426] [812e5f5b] ? dev_watchdog+0xe3/0x153
[292670.941458] [810471ee] ? call_timer_fn+0x4b/0xf6
[292670.941488] [812e5e78] ? netif_tx_lock+0x7a/0x7a
[292670.941518] [810482fe] ? run_timer_softirq+0x18d/0x1d6
[292670.941548] [81042576] ? __do_softirq+0xec/0x209
[292670.941578] [8104275e] ? irq_exit+0x3f/0x83
[292670.941608] [8100e6f3] ? do_IRQ+0x81/0x97
[292670.941639] [81390a6d] ? common_interrupt+0x6d/0x6d
[292670.941666] EOI [812a70ec] ? arch_local_irq_enable+0x4/0x8
[292670.941708] [812a74af] ? cpuidle_enter_state+0x46/0xb1
[292670.941739] [812a75f0] ? cpuidle_idle_call+0xd6/0x147
[292670.941770] [81013add] ? arch_cpu_idle+0x6/0x1a
[292670.941802] [810734d3] ? cpu_startup_entry+0x114/0x191
[292670.941835] [816b3d51] ? start_kernel+0x3e8/0x3f3
[292670.941864] [816b377f] ? repair_env_string+0x57/0x57
[292670.941895] [816b359a] ? x86_64_start_kernel+0xf2/0xfd
[292670.941924] ---[ end trace 119aa908393ed9b3 ]---
[292670.941950] ib0: transmit timeout: latency 1356 msecs
[292670.941977] ib0: queue stopped 1, tx_head 595756, tx_tail 595756

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: [PATCH 0/6] SRP initiator patches for kernel 3.15
2014-02-20 14:50 GMT+04:00 Bart Van Assche bvanass...@acm.org:
> This patch series includes the following six patches:
> 0001-scsi_transport_srp-Fix-two-kernel-doc-warnings.patch
> 0002-IB-srp-Add-more-logging.patch
> 0003-IB-srp-Fail-SCSI-commands-silently.patch
> 0004-IB-srp-Avoid-duplicate-connections.patch
> 0005-IB-srp-Make-writing-into-the-add_target-sysfs-attrib.patch
> 0006-IB-srp-Avoid-that-writing-into-add_target-hangs-due-.patch

Is it possible to get some of these into stable 3.10.x?

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: Need help with strange mlx4_core error
2014-02-04 Vasiliy Tolstov v.tols...@selfip.ru:
> This is pretty much very old firmware, could you update it please with the latest GA from the mellanox site and see if things smile back? also send the lspci | grep -i Mellanox

I have updated the firmware to 2.7.200 (the latest on the Supermicro FTP). Now I'm going to test it.

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: Need help with strange mlx4_core error
2014-02-03 Or Gerlitz or.gerl...@gmail.com:
> This is pretty much very old firmware, could you update it please with the latest GA from the mellanox site and see if things smile back? also send the lspci | grep -i Mellanox

Thanks. I'm trying to get in touch with the Supermicro team, because the blade is from Supermicro and the Mellanox mftlint can't upgrade the firmware =(.

lspci output:

03:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev a0)

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Need help with strange mlx4_core error
Hi all. After switching to kernel 3.10, I sometimes get errors like this in dmesg:

[Sun Jan 26 09:58:50 2014] mlx4_ib destroy_qp_common: modify QP 007def to RESET failed.
[Sun Jan 26 20:27:52 2014] mlx4_ib destroy_qp_common: modify QP 0221ca to RESET failed.
[Mon Jan 27 03:44:20 2014] mlx4_ib destroy_qp_common: modify QP 0232ad to RESET failed.
[Mon Jan 27 14:23:25 2014] mlx4_core :03:00.0: command 0x19 failed: fw status = 0x9
[Mon Jan 27 14:23:25 2014] ib0: failed to modify QP to INIT: -9
[Mon Jan 27 16:37:00 2014] mlx4_ib destroy_qp_common: modify QP 00258a to RESET failed.
[Mon Jan 27 16:58:26 2014] mlx4_core :03:00.0: Internal error detected:
[Mon Jan 27 16:58:26 2014] mlx4_core :03:00.0: buf[00]: 001805a5
[Mon Jan 27 16:58:26 2014] mlx4_core :03:00.0: buf[01]:
[Mon Jan 27 16:58:26 2014] mlx4_core :03:00.0: buf[02]: 20060384
[Mon Jan 27 16:58:26 2014] mlx4_core :03:00.0: buf[03]:
[Mon Jan 27 16:58:26 2014] mlx4_core :03:00.0: buf[04]: 0018050c
[Mon Jan 27 16:58:26 2014] mlx4_core :03:00.0: buf[05]: 0001
[Mon Jan 27 16:58:26 2014] mlx4_core :03:00.0: buf[06]: 2cd4
[Mon Jan 27 16:58:26 2014] mlx4_core :03:00.0: buf[07]: 0084
[Mon Jan 27 16:58:26 2014] mlx4_core :03:00.0: buf[08]: f8af
[Mon Jan 27 16:58:26 2014] mlx4_core :03:00.0: buf[09]: 4000
[Mon Jan 27 16:58:26 2014] mlx4_core :03:00.0: buf[0a]:
[Mon Jan 27 16:58:26 2014] mlx4_core :03:00.0: buf[0b]:
[Mon Jan 27 16:58:26 2014] mlx4_core :03:00.0: buf[0c]:
[Mon Jan 27 16:58:26 2014] mlx4_core :03:00.0: buf[0d]:
[Mon Jan 27 16:58:26 2014] mlx4_core :03:00.0: buf[0e]:
[Mon Jan 27 16:58:26 2014] mlx4_core :03:00.0: buf[0f]:
[Mon Jan 27 16:58:26 2014] mlx4_en :03:00.0: Internal error detected, restarting device
[Mon Jan 27 16:58:35 2014] mlx4_ib destroy_qp_common: modify QP 002cd4 to RESET failed.
[Mon Jan 27 16:58:35 2014] ib0: dev_queue_xmit failed to requeue packet
[Mon Jan 27 16:58:35 2014] ib0: dev_queue_xmit failed to requeue packet
[Mon Jan 27 16:58:35 2014] ib0: dev_queue_xmit failed to requeue packet
[Mon Jan 27 16:58:36 2014] mlx4_core: Initializing :03:00.0
[Mon Jan 27 16:58:38 2014] mlx4_core :03:00.0: irq 59 for MSI/MSI-X
[Mon Jan 27 16:58:38 2014] mlx4_core :03:00.0: irq 60 for MSI/MSI-X
[Mon Jan 27 16:58:38 2014] mlx4_core :03:00.0: irq 61 for MSI/MSI-X
[Mon Jan 27 16:58:38 2014] mlx4_core :03:00.0: irq 62 for MSI/MSI-X
[Mon Jan 27 16:58:38 2014] mlx4_core :03:00.0: irq 63 for MSI/MSI-X
[Mon Jan 27 16:58:38 2014] mlx4_core :03:00.0: irq 64 for MSI/MSI-X
[Mon Jan 27 16:58:38 2014] mlx4_core :03:00.0: irq 65 for MSI/MSI-X
[Mon Jan 27 16:58:38 2014] mlx4_core :03:00.0: irq 66 for MSI/MSI-X
[Mon Jan 27 16:58:38 2014] mlx4_core :03:00.0: irq 67 for MSI/MSI-X
[Mon Jan 27 16:58:38 2014] mlx4_core :03:00.0: irq 68 for MSI/MSI-X
[Mon Jan 27 16:58:38 2014] mlx4_core :03:00.0: irq 69 for MSI/MSI-X
[Mon Jan 27 16:58:38 2014] mlx4_core :03:00.0: irq 70 for MSI/MSI-X
[Mon Jan 27 16:58:38 2014] mlx4_core :03:00.0: irq 71 for MSI/MSI-X
[Mon Jan 27 16:58:38 2014] mlx4_core :03:00.0: command 0xc failed: fw status = 0x40
[Mon Jan 27 16:58:38 2014] mlx4_en :03:00.0: UDP RSS is not supported on this device.

I don't know what traffic can trigger this (I'm using IPoIB in connected mode), but I think it may happen when someone sends massive UDP traffic. What can I do to fix this issue? When the error appears, the ib0 device (IPoIB) goes down. Many thanks for any help.

-- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: Need help with strange mlx4_core error
] mlx4_core :03:00.0: 4530440 KB of HCA context requires KB aux memory. [583791.348513] mlx4_core :03:00.0: Mapped 38 chunks/ KB for ICM aux. [583791.349649] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 0 for ICM. [583791.350756] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 4000 for ICM. [583791.351862] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 8000 for ICM. [583791.351894] mlx4_core :03:00.0: Mapped 1 chunks/4 KB at c000 for ICM. [583791.351942] mlx4_core :03:00.0: Mapped 1 chunks/8 KB at 11484 for ICM. [583791.353028] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 10c00 for ICM. [583791.354151] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11000 for ICM. [583791.355235] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 10800 for ICM. [583791.356319] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11480 for ICM. [583791.357404] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11200 for ICM. [583791.358489] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 1 for ICM. [583791.359571] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11380 for ICM. [583791.360657] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11300 for ICM. [583791.361741] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11400 for ICM. [583791.362824] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11404 for ICM. [583791.363907] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11408 for ICM. [583791.364991] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 1140c for ICM. [583791.366074] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11410 for ICM. [583791.367158] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11414 for ICM. [583791.368240] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11418 for ICM. [583791.369324] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 1141c for ICM. [583791.370409] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11420 for ICM. [583791.371491] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11424 for ICM. [583791.372574] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11428 for ICM. 
[583791.373657] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 1142c for ICM. [583791.374741] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11430 for ICM. [583791.375823] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11434 for ICM. [583791.376908] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11438 for ICM. [583791.377992] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 1143c for ICM. [583791.379073] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11440 for ICM. [583791.380158] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11444 for ICM. [583791.381242] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11448 for ICM. [583791.382326] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 1144c for ICM. [583791.383408] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11450 for ICM. [583791.384493] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11454 for ICM. [583791.385577] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11458 for ICM. [583791.386659] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 1145c for ICM. [583791.387742] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11460 for ICM. [583791.388824] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11464 for ICM. [583791.389909] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11468 for ICM. [583791.390992] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 1146c for ICM. [583791.392076] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11470 for ICM. [583791.393159] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11474 for ICM. [583791.394243] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 11478 for ICM. [583791.395325] mlx4_core :03:00.0: Mapped 1 chunks/256 KB at 1147c for ICM. 
[583791.905303] mlx4_core :03:00.0: irq 59 for MSI/MSI-X [583791.905313] mlx4_core :03:00.0: irq 60 for MSI/MSI-X [583791.905319] mlx4_core :03:00.0: irq 61 for MSI/MSI-X [583791.905327] mlx4_core :03:00.0: irq 62 for MSI/MSI-X [583791.905333] mlx4_core :03:00.0: irq 63 for MSI/MSI-X [583791.905340] mlx4_core :03:00.0: irq 64 for MSI/MSI-X [583791.905346] mlx4_core :03:00.0: irq 65 for MSI/MSI-X [583791.905354] mlx4_core :03:00.0: irq 66 for MSI/MSI-X [583791.905360] mlx4_core :03:00.0: irq 67 for MSI/MSI-X [583791.905366] mlx4_core :03:00.0: irq 68 for MSI/MSI-X [583791.905373] mlx4_core :03:00.0: irq 69 for MSI/MSI-X [583791.905379] mlx4_core :03:00.0: irq 70 for MSI/MSI-X [583791.905386] mlx4_core :03:00.0: irq 71 for MSI/MSI-X [583791.970917] mlx4_core :03:00.0: NOP command IRQ test passed [583791.971212] mlx4_core :03:00.0: command 0xc failed: fw status = 0x40 -- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
default mtu sizes for connected/datagram modes and driver versions
Hi all. I'm trying to understand the MTU sizes for datagram and connected modes. I have some old Debian systems whose IP over IB interfaces have an MTU of 1500; on newer systems I get 4092. Why? And in datagram mode I get 2042. What are the best MTU values when using IP over IB? Thanks! -- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
memory synchronization and replication
Hi all. Does anybody know how to use the RDMA driver to share memory between two servers? For example, I need memory from srv1 copied to srv2, and from srv2 to srv1. If nobody knows, can somebody share info or a link to docs that explain which functions/subsystems I need to know and use to build such a thing? -- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: memory synchronization and replication
2013/12/24 Anuj Kalia anujkaliai...@gmail.com: I am not sure if I understand your question correctly. Copying data between two machines is straightforward with RDMA: the verbs operations do exactly that. Thanks. I need something like this: two servers write data to some storage (SATA disks). At the OS level I have a disk read/write cache that I need to keep identical on both servers for failover. -- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: Mellanox VPI
2013/10/6 Or Gerlitz ogerl...@mellanox.com: # lspci | grep Mell 06:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0) 07:00.0 Network controller: Mellanox Technologies MT27520 Family # echo ib > /sys/bus/pci/devices/\:06\:00.0/mlx4_port1 # echo eth > /sys/bus/pci/devices/\:06\:00.0/mlx4_port2 Thanks, but as I see it this card has two ports. How about eth and ib on the same port at the same time? -- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: Mellanox VPI
2013/10/6 Jack Morgenstein ja...@dev.mellanox.co.il: You cannot have ETH and IB link layers existing on the same port at the same time. However, you can run IB applications over an ETH link layer by using RoCE (RDMA over Converged Ethernet). If you do: ibv_devinfo on a host where an HCA has a port configured to an ETH link layer, you should see an entry for the ETH layer port. Under that port number, you will see: ... link_layer: Ethernet Thanks, Jack, for the answer. -- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
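As a rough companion to the ibv_devinfo check Jack describes, the same per-port link layer can usually also be read from sysfs. The sketch below is only an illustration: it assumes the kernel exposes a link_layer attribute under /sys/class/infiniband/<device>/ports/<port>/, and the function name port_link_layers is made up for this example.

```python
from pathlib import Path


def port_link_layers(sysfs_root="/sys/class/infiniband"):
    """Return {(device, port): link_layer} for every port found under sysfs_root.

    Assumption: each port directory contains a 'link_layer' file holding a
    string such as 'InfiniBand' or 'Ethernet'.
    """
    result = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        # No RDMA devices (or no sysfs tree at this path): nothing to report.
        return result
    for attr in sorted(root.glob("*/ports/*/link_layer")):
        port_dir = attr.parent              # .../<device>/ports/<port>
        device = port_dir.parent.parent.name
        result[(device, int(port_dir.name))] = attr.read_text().strip()
    return result


if __name__ == "__main__":
    for (device, port), layer in port_link_layers().items():
        print(f"{device} port {port}: {layer}")
```

A port that reports Ethernet here is the case where IB applications would run over RoCE, as described above.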
Mellanox VPI
Hi all. I'm interested in Mellanox ConnectX-3 cards that support VPI. As I read in http://www.openfabrics.org/downloads/OFED/ofed-1.4/OFED-1.4-docs/mlx4_release_notes.txt, VPI provides the ability to autodetect the link type or to control which type is used. But is it possible to have 10Gb Ethernet and InfiniBand together on one port? -- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: recommend setting for dev_loss_tmo and fast_io_fail_tmo
2013/6/14 Bart Van Assche bvanass...@acm.org: Hello Vasiliy, Does this mean that you are not subscribed to the linux-rdma mailing list ? A patch series that allows faster reconnects and that allows to disable dev_loss_tmo is currently under review. See e.g. http://www.mail-archive.com/linux-rdma@vger.kernel.org/msg15628.html. I'm subscribed, but I get a high volume of e-mail from lkml, scst, libvirt, etc. and may have missed some messages while reading. Thanks for the link! -- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
recommend setting for dev_loss_tmo and fast_io_fail_tmo
Hello. I'm using kernel 3.9.5 and ib_srp-backport from GitHub. What are the recommended settings for dev_loss_tmo and fast_io_fail_tmo? Right now I have dev_loss_tmo = 60 and fast_io_fail_tmo = 40. Is that right? I need to wait no more than 100 seconds for a failed path. (I run some Xen VMs on this host, and their Linux kernels don't like being stuck for more than 120 seconds =)) -- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: recommend setting for dev_loss_tmo and fast_io_fail_tmo
2013/6/13 Bart Van Assche bvanass...@acm.org: Hello Vasily, Hello, Bart. The default value for the reconnect_delay parameter in that version of the SRP initiator is 10 seconds. So if you want to give the SRP initiator a chance to try once to reconnect before failing I/O fast you can set the fast_io_fail_tmo parameter to e.g. 15 seconds. That should limit the time during which I/O is stuck to about 76 seconds (61 seconds IB RC timeout + 15 seconds for the fast I/O fail timeout). Hm. Does that mean that if I set fast_io_fail_tmo = 40, I get a timeout after 61 + 40 = 101 seconds? How to configure dev_loss_tmo depends on the configuration at the initiator side. If you are using initiator-side mirroring then it is a good idea to set dev_loss_tmo to a very large value in order to avoid /dev/sd* reassignment. Note: with the patch series I posted earlier this week it is possible to set dev_loss_tmo to off, something that is not yet possible with the ib_srp-backport project. Yes, I'm using RAID1 on the initiator side. Does that mean that in my case dev_loss_tmo can be switched off? P.S. With kernel 3.9.5, do I need your backported ib_srp driver, or can I use the mainline kernel drivers for faster reconnects? P.P.S. Can you give me the e-mail subject or a link to the patches that allow switching off dev_loss_tmo? -- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
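The arithmetic in this exchange can be written down as a small sketch. It assumes the simple additive model from Bart's reply (IB RC timeout plus fast_io_fail_tmo) and takes the 61-second RC timeout from that reply; the function names are made up for illustration, and real stall times may differ.

```python
# Value quoted in the thread for the IB RC timeout, in seconds.
IB_RC_TIMEOUT_S = 61


def worst_case_stall(fast_io_fail_tmo, rc_timeout=IB_RC_TIMEOUT_S):
    """Approximate time I/O stays stuck: RC timeout + fast_io_fail_tmo."""
    return rc_timeout + fast_io_fail_tmo


def max_fast_io_fail_tmo(budget_s, rc_timeout=IB_RC_TIMEOUT_S):
    """Largest fast_io_fail_tmo that keeps the stall within budget_s seconds."""
    return budget_s - rc_timeout


if __name__ == "__main__":
    print(worst_case_stall(15))       # Bart's example setting
    print(worst_case_stall(40))       # the setting asked about
    print(max_fast_io_fail_tmo(100))  # limit for a 100-second budget
```

Under this model, fast_io_fail_tmo = 40 lands just over a 100-second budget, while any value of 39 or below stays within it.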
Re: How to do replication right with SRP or remote storage?
2013/6/10 Sebastian Riemer sebastian.rie...@profitbricks.com: I can also recommend you Vasiliy Tolstov v.tols...@selfip.ru. He also uses SRP with MD RAID-1. He could convince Neil to fix the MD data offset. OpenSource is all about the right allies, Thanks for the recommendation... but --data-offset is already fixed. Some time ago Neil added this option for RAID operations, but when I tried it, it did not work. After my message to the LKML and mdadm developer lists, Neil said it could be fixed. As far as I can see from http://git.neil.brown.name/?p=mdadm.git;a=shortlog, this was fixed in May. Maybe we need to test it. -- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: [ANNOUNCE] SRP: ProfitBricks publishes its SRP Initiator patches
2013/5/15 Bart Van Assche bvanass...@acm.org: The traditional approach to block access from a specific initiator is to modify the LUN masking configuration at the target side dynamically. More information about LUN masking can be found in the scst.conf man page and in the srpt/README file. I will make a patch available for ib_srpt that allows to close a single session at a time. Thanks for the answers and thanks for the patch, I'll try it. -- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: [ANNOUNCE] SRP: ProfitBricks publishes its SRP Initiator patches
If I need faster reconnects and the ability to close a session from the initiator side on QLogic hardware, is that possible? Or do these patches only cover Mellanox cards? 2013/5/8 Sebastian Riemer sebastian.rie...@profitbricks.com: FYI: I've released version 0.6 of my SRP patches today. The automatic reconnect is included now. The tests for that will follow in the next version. But we already did quite intensive testing for that. Hard reboot and also soft reboot of the target are possible with that reconnect. It just reconnects and everything is fine again. With soft reboot I mean: disabling the target, removing the exports, rebooting, exporting the same LUNs, re-enabling the target. It also has an automatic mechanism to reduce the possibility of a DDoS attack reconnect. It automatically reconnects at different intervals. Check it out: https://github.com/sriemer/ib_srp Cheers, Sebastian -- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: tune ib stack
Sorry for bumping an old thread. I solved my problems with new firmware. I have Supermicro servers that rebrand the Mellanox firmware (recompiled with some bits changed). Now everything works fine: I have 40 Gb/s QDR instead of 10 Gb/s. 2013/4/9 Sebastian Riemer sebastian.rie...@profitbricks.com: On 09.04.2013 16:23, Hal Rosenstock wrote: So these values are exactly the same as in ibv_devinfo and can be set in /sys/class/infiniband/mlx4_0/device/mlx4_port1_mtu. I've found the PortInfo with the command smpquery portinfo -C mlx4_0 3 1 where I'm using the first HCA to contact the SM. I tell the SM the destination LID ('3' here in my case) and the destination port ('1'). Is there another method to set the max MTU? That doesn't set max MTU (MTUCap) but merely reads it (for that port). Sorry, copy and paste error. I've meant the mlx4 file: /sys/class/infiniband/mlx4_0/device/mlx4_port1_mtu But you've answered that by vendor specific. Thanks for the valuable information! For us most interesting would be if the MTU can be changed live without any service disruption. Looks like the mlx4 driver can't provide that. Perhaps switches can do that. Cheers, Sebastian -- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: [ANNOUNCE] SRP: ProfitBricks publishes its SRP Initiator patches
2013/5/14 Bart Van Assche bvanass...@acm.org: The ability to close a session from the initiator side went upstream in kernel 3.8 (/sys/class/srp_remote_ports/port-h:n/delete). Regarding faster reconnects: please keep in mind that after a cable pull it can easily take 20 seconds before link training and initialization by the subnet manager have finished. It's not possible to make an initiator reconnect in less time than what the hardware and subnet manager need to bring the link back. Thanks. What about closing a session from the target side? For example, what if I need to close the SRP session and block all access from a specific initiator? -- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: tune ib stack
2013/4/9 Sebastian Riemer sebastian.rie...@profitbricks.com: Because 2048 is the default and 4096 is the max. supported MTU by the hardware. How can I set the active MTU? Something like this: echo 4096 > /sys/class/infiniband/mlx4_0/device/mlx4_port1_mtu After doing this, all SRP connections drop and the port goes down, and I need to restart openibd. 06:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3] Subsystem: Mellanox Technologies Device 0017 Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast TAbort- TAbort- MAbort- SERR- PERR- INTx- Latency: 0, Cache Line Size: 64 bytes Interrupt: pin A routed to IRQ 42 Region 0: Memory at df90 (64-bit, non-prefetchable) [size=1M] Region 2: Memory at de00 (64-bit, prefetchable) [size=8M] Capabilities: [40] Power Management version 3 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-) Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME- Capabilities: [48] Vital Product Data Not readable Capabilities: [9c] MSI-X: Enable+ Count=128 Masked- Vector table: BAR=0 offset=0007c000 PBA: BAR=0 offset=0007d000 Capabilities: [60] Express (v2) Endpoint, MSI 00 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s 64ns, L1 unlimited ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported- RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop- FLReset- MaxPayload 256 bytes, MaxReadReq 512 bytes DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend- LnkCap: Port #8, Speed 8GT/s, Width x8, ASPM L0s, Latency L0 unlimited, L1 unlimited ClockPM- Surprise- LLActRep- BwNot- LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+ ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- LnkSta: Speed 8GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt- DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ DevCtl2: Completion Timeout: 50us to
50ms, TimeoutDis- LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- Compliance De-emphasis: -6dB LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1- EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest- Capabilities: [100 v1] Alternative Routing-ID Interpretation (ARI) ARICap: MFVC- ACS-, Next Function: 0 ARICtl: MFVC- ACS-, Function Group: 0 Capabilities: [148 v1] Device Serial Number 00-25-90-ff-ff-17-9b-24 Capabilities: [18c v1] #19 Kernel driver in use: mlx4_core Kernel modules: mlx4_core Could be a bug. Which OFED/Kernel (if using in-tree IB modules) do you use? Mine says with ConnectX2 QDR: 40 Gb/sec (4X QDR) I'm using a stock 3.8.6 kernel with Xen patches on top, and I use the modules provided with the kernel (only ib_srp comes from Bart's GitHub repo). You should see 40 Gb/sec (4X QDR) here. Perhaps the OFED is too old so that FDR and ConnectX 3 aren't supported, yet. 10 Gb/sec (4X) seems to be the default case if a rate isn't supported. Yes, on an older ConnectX card I see this, but with ConnectX-3 I get only 10 Gb. -- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
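To spot the fallback case discussed here on many hosts at once, a rate string can be parsed and checked for a missing speed name. This is only a sketch: it assumes rate strings look like the two quoted in this thread ("40 Gb/sec (4X QDR)" and "10 Gb/sec (4X)"), and parse_rate and looks_like_fallback are hypothetical helper names, not part of any tool.

```python
import re

# Matches strings such as "40 Gb/sec (4X QDR)" or "10 Gb/sec (4X)".
_RATE_RE = re.compile(
    r"^\s*(\d+(?:\.\d+)?)\s+Gb/sec\s+\((\d+)X(?:\s+(\w+))?\)\s*$"
)


def parse_rate(text):
    """Split a rate string into (gbps, lane_width, speed_name or None)."""
    match = _RATE_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognised rate string: {text!r}")
    gbps, width, speed = match.groups()
    return float(gbps), int(width), speed


def looks_like_fallback(text):
    """True when no speed name (SDR/DDR/QDR/...) is present in the string.

    Assumption taken from the thread: a bare '(4X)' is what shows up when
    the actual rate is not recognised by the software stack.
    """
    return parse_rate(text)[2] is None


if __name__ == "__main__":
    for sample in ("40 Gb/sec (4X QDR)", "10 Gb/sec (4X)"):
        print(sample, "->", parse_rate(sample), looks_like_fallback(sample))
```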
linux 3.8.6 and srp backports
Hello. Some time ago, when I was on kernel 3.6, I used https://github.com/bvanassche/ib_srp-backport/ for the SRP drivers on my Linux server. Now I'm using 3.8.6. Do I still need something from https://github.com/bvanassche/ib_srp-backport/, or have all the patches already been applied upstream? -- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: linux 3.8.6 and srp backports
2013/4/8 Bart Van Assche bvanass...@acm.org: Unfortunately not all patches present in the ib_srp-backport project are already upstream. If you want to avoid that in a multipath setup sooner or later failover or failback fails with a SCSI device in the offline state you will still need the ib_srp-backport project. I'm currently working on converting that project into a new patch series intended for inclusion in the mainline kernel. Thanks. I've already checked and confirmed that I need your GitHub branch =). Thanks for the great work! -- Vasiliy Tolstov, e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: [ANNOUNCE] OFED-3.5-rc4 is available
Hello. Is this release production-ready for heavy load? Or is it best to use OFED 1.5.x for now? 2013/1/3 Vladimir Sokolovsky v...@dev.mellanox.co.il: Hi, OFED 3.5-rc4 is available. The tarball is available on: http://www.openfabrics.org/downloads/OFED/ofed-3.5/OFED-3.5-rc4.tgz To get BUILD_ID run ofed_info Please report any issues in bugzilla https://bugs.openfabrics.org/ for OFED 3.5 OFED-3.5-rc4 Main Changes from OFED 3.5-rc3 --- RDMA/nes: Fixes for OFED-3.5 RC3 RDMA/cxgb4: Keep the maximum number of stag limited to T4_MAX_NUM_STAG Updated packages: infinipath-psm-3.1-364.1140_open Regards, Vladimir -- Vasiliy Tolstov, Clodo.ru e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: srp-ha backport
Hello again. I have now switched from the SLES kernel to 3.6.7. Everything works fine, but your patches from GitHub now produce some errors: /sbin/service openibd restart Unloading ib_srp [FAILED] Removing 'ib_srp': Device or resource busy xen11:~ # rmmod ib_srp ERROR: Removing 'ib_srp': Device or resource busy xen11:~ # lsmod | grep srp ib_srp 47710 0 [permanent] ib_cm 46778 2 rdma_cm,ib_srp ib_sa 33627 4 rdma_ucm,rdma_cm,ib_srp,ib_cm ib_core 82311 9 rdma_ucm,rdma_cm,iw_cm,ib_srp,ib_cm,ib_sa,ib_uverbs,ib_umad,ib_mad How can I solve this? 2012/11/23 Bart Van Assche bvanass...@acm.org: On 11/23/12 07:53, Vasiliy Tolstov wrote: Is it possible to backport the needed patch to SLES11 SP2 (I can't switch kernels now because I'm using Xen on the initiator node and would need to recompile many different packages for a new kernel) In every Linux distribution I know the SCSI core is not a kernel module but is built into the kernel itself. For SLES that means that only Novell can backport SCSI core patches to SLES. Bart. -- Vasiliy Tolstov, Clodo.ru e-mail: v.tols...@selfip.ru jabber: v...@selfip.ru
Re: srp-ha backport
Bart Van Assche bvanassche@... writes: Hello, In case anyone would like to start using the srp-ha patch series before it gets upstream, a backported version of that patch series is available here: http://github.com/bvanassche/ib_srp-backport. The advantages of that version of ib_srp over what's upstream are: - Better robustness against cable plugging. - Allows closing an SRP connection from the initiator side (via the new delete attribute in sysfs). - Configurable dev_loss_tmo and fast_io_fail_tmo parameters. - Builds against any kernel in the range 2.6.32..3.6. - Can be used on RHEL 6.x systems. In combination with srp_daemon and multipath-tools this should allow to build a reliable H.A. SRP solution. Note: I haven't been able to test that code against every existing mainstream or distro kernel. Feedback is welcome though. Thanks for this backport! I have a problem under SLES 11 SP2 (kernel 3.0.42-0.7-xen): when I shut down the SRP target (reboot one SAS server), multipath -ll does not respond. Setting identical dev_loss_tmo and fast_io_fail_tmo values in multipath and srp changes nothing; multipath -ll unblocks only when the server comes back up.
dev_loss_tmo = 15 fast_io_fail_tmo = 10
multipath.conf
defaults {
    getuid_callout /bin/cat /sys/block/%n/device/model
    path_grouping_policy failover
    failback immediate
    no_path_retry fail
    path_checker tur
    rr_weight uniform
    rr_min_io 100
    polling_interval 10
    checker_timeout 10
    fast_io_fail_tmo 60
    dev_loss_tmo 120
}
blacklist {
    devnode cciss
    devnode fd
    devnode hd
    devnode md
    devnode sr
    devnode scd
    devnode st
    devnode ram
    devnode raw
    devnode loop
}
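Since the message compares timeout values in two places, a quick way to see whether they really agree is to pull both pairs side by side. The sketch below is an illustration only: it assumes the 15/10 pair quoted above are the ib_srp values, uses a deliberately naive line-based reader for the defaults section (not a full multipath.conf parser), and the names parse_defaults and tmo_mismatches are invented for the example.

```python
def parse_defaults(conf_text):
    """Return {keyword: value} for the 'defaults { ... }' section only.

    Naive reader: one 'keyword value' pair per line, '#' starts a comment.
    """
    settings = {}
    in_defaults = False
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if line.startswith("defaults") and line.endswith("{"):
            in_defaults = True
        elif line == "}":
            in_defaults = False
        elif in_defaults:
            parts = line.split(None, 1)
            settings[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return settings


def tmo_mismatches(srp_values, multipath_defaults):
    """Return {name: (srp_value, multipath_value)} for values that differ."""
    diffs = {}
    for name in ("dev_loss_tmo", "fast_io_fail_tmo"):
        mp_value = multipath_defaults.get(name)
        if mp_value is not None and str(srp_values[name]) != mp_value:
            diffs[name] = (str(srp_values[name]), mp_value)
    return diffs


if __name__ == "__main__":
    conf = """
    defaults {
        fast_io_fail_tmo 60
        dev_loss_tmo 120
    }
    """
    srp = {"dev_loss_tmo": 15, "fast_io_fail_tmo": 10}  # values from the post
    print(tmo_mismatches(srp, parse_defaults(conf)))
```

With the values as posted, both timeouts differ between the two places, so the configuration shown is not the "identical" case described in the text.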