Hello Eric,

[RESEND: the VMFS creation times were swapped between the on/off cases
in my previous mail. Correct is: with tcp_autocorking on it took over
2 minutes, with it off less than 4 seconds.]
[RESEND 2: the throughput graphs were swapped as well ;-(]

> * Thomas Glanzmann <tho...@glanzmann.de> [2014-02-07 08:55]:
> > Creating a 4 TB VMFS filesystem over iSCSI takes 24 seconds on 3.12
> > and 15 minutes on 3.14.0-rc2+.

* Nicholas A. Bellinger <n...@linux-iscsi.org> [2014-02-07 20:30]:
> Would it be possible to try a couple of different stable kernel
> versions to help track this down?

I bisected it [1] and found the offending commit f54b311 ("tcp: auto
corking") [2]: "if we have a small send and a previous packet is
already in the qdisc or device queue, defer until TX completion or we
get more data." (description by David S. Miller)

I gathered a pcap with tcp_autocorking on and off.

On: creating a 500 GB VMFS file system took 2 minutes 24 seconds

https://thomas.glanzmann.de/tmp/tcp_auto_corking_on.pcap.bz2
https://thomas.glanzmann.de/tmp/screenshot-mini-2014-02-08-09:46:34.png
https://thomas.glanzmann.de/tmp/screenshot-mini-2014-02-08-09:52:28.png

Off (sysctl net.ipv4.tcp_autocorking=0): creating a 500 GB VMFS file
system took 4 seconds

https://thomas.glanzmann.de/tmp/tcp_auto_corking_off.pcap.bz2
https://thomas.glanzmann.de/tmp/screenshot-mini-2014-02-08-09:45:43.png
https://thomas.glanzmann.de/tmp/screenshot-mini-2014-02-08-09:53:17.png

The first graph can be generated by bunzip2ing the pcap, opening it in
Wireshark and selecting Statistics > IO Graph with the unit changed to
Bytes/Tick. The second graph can be generated by selecting Statistics >
TCP Stream Graph > Round Trip Time. You can also see that the round
trip time increases by a factor of at least 25.

I once saw a similar problem with delayed ACK packets in the
paravirtualized network driver under Xen: the TCP window filled up and
throughput dropped from 30 MB/s to less than 100 KB/s. The symptom was
that logging in to a Windows desktop took more than 10 minutes where it
used to take less than 30 seconds, because the user's profile was
loaded slowly from a CIFS server. The culprit there was also delayed
small packets: ACK packets, in the CIFS case.

So far I have only proven the tcp auto corking regression for iSCSI,
but I assume we will see it in many other protocols if it stays enabled
by default.
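For reference, the condition that decides whether a small send gets
deferred is compact; as far as I can tell from commit [2], it boils
down to this predicate added in net/ipv4/tcp.c (the comments are mine):

static bool tcp_should_autocork(struct sock *sk, struct sk_buff *skb,
				int size_goal)
{
	return skb->len < size_goal &&	  /* write smaller than a full segment */
	       sysctl_tcp_autocorking && /* net.ipv4.tcp_autocorking enabled */
	       skb != tcp_write_queue_head(sk) && /* not alone in the write queue */
	       atomic_read(&sk->sk_wmem_alloc) > skb->truesize;
	       /* ^ a previous packet still awaits TX completion */
}

With small iSCSI PDUs there is almost always a previous packet in
flight, so if I read this correctly every small write gets deferred
until TX completion, which would match the inflated round trip times
in the graphs above.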
I found the problem by doing the following:

- I compiled each kernel by executing:

yes '' | make oldconfig
time make -j 24
make modules_install
mkinitramfs -o /boot/initrd.img-bisect <version>

- I cleaned the iSCSI configuration after each test by issuing:

/etc/init.d/target stop
rm /iscsi?/* /etc/target/*

- I configured iSCSI after each reboot:

cat > lio-v101.conf <<EOF
set global auto_cd_after_create=false
/backstores/fileio create shared-01.v101.campusvl.de /iscsi1/shared-01.v101.campusvl.de size=500G buffered=true
/iscsi create iqn.2013-03.de.campusvl.v101.storage:shared-01.v101.campusvl.de
/iscsi/iqn.2013-03.de.campusvl.v101.storage:shared-01.v101.campusvl.de/tpgt1/portals create 10.101.99.4
/iscsi/iqn.2013-03.de.campusvl.v101.storage:shared-01.v101.campusvl.de/tpgt1/portals create 10.101.99.5
/iscsi/iqn.2013-03.de.campusvl.v101.storage:shared-01.v101.campusvl.de/tpgt1/portals create 10.102.99.4
/iscsi/iqn.2013-03.de.campusvl.v101.storage:shared-01.v101.campusvl.de/tpgt1/portals create 10.102.99.5
/iscsi/iqn.2013-03.de.campusvl.v101.storage:shared-01.v101.campusvl.de/tpgt1/luns create /backstores/fileio/shared-01.v101.campusvl.de lun=10
/iscsi/iqn.2013-03.de.campusvl.v101.storage:shared-01.v101.campusvl.de/tpgt1/ set attribute authentication=0 demo_mode_write_protect=0 generate_node_acls=1 cache_dynamic_acls=1
saveconfig
yes
EOF
targetcli < lio-v101.conf

- I then pointed a freshly booted ESXi 5.5.0 build 1331820 (deployed via
  Auto Deploy) at the iSCSI target, configured the portal, rescanned and
  created a 500 GB VMFS 5 filesystem while timing it: longer than
  2 minutes meant bad, below 10 seconds meant good.

- git bisect good/bad accordingly.

My network config is:

auto bond0
iface bond0 inet static
	address 10.100.4.62
	netmask 255.255.0.0
	gateway 10.100.0.1
	slaves eth0 eth1
	bond-mode 802.3ad
	bond-miimon 100

auto bond0.101
iface bond0.101 inet static
	address 10.101.99.4
	netmask 255.255.0.0

auto bond1
iface bond1 inet static
	address 10.100.5.62
	netmask 255.255.0.0
	slaves eth2 eth3
	bond-mode 802.3ad
	bond-miimon 100

auto bond1.101
iface bond1.101 inet static
	address 10.101.99.5
	netmask 255.255.0.0

I propose to disable tcp_autocorking by default, because it obviously
degrades iSCSI performance and probably that of many other protocols.
The commit also mentions that applications can explicitly disable auto
corking; we should probably do that for the iSCSI target, but I don't
know how. Anyone? (See the sketch after my signature for one idea.)

[1] http://pbot.rmdir.de/a65q6MjgV36tZnn5jS-DUQ
[2] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=f54b311142a92ea2e42598e347b84e1655caf8e3

Cheers,
        Thomas
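PS: Regarding disabling auto corking per application: assuming that
setting and then clearing TCP_CORK forces TCP to push queued data, as
the sysctl documentation added in [2] suggests (applications can still
use TCP_CORK when they know how/when to uncork their sockets), the
iSCSI target could perhaps call something like the helper below after
queueing a response. The helper name and the call site are made up by
me, and this is an untested sketch, not a patch:

#include <linux/net.h>		/* kernel_setsockopt() */
#include <linux/socket.h>	/* SOL_TCP */
#include <linux/tcp.h>		/* TCP_CORK */

/* Hypothetical helper: force out anything auto corking may have
 * deferred on this connection's socket. Clearing TCP_CORK should make
 * TCP push pending frames; setting it first is the documented
 * cork/uncork pattern. 'sock' would be the iSCSI connection's
 * struct socket. Untested!
 */
static void iscsit_uncork(struct socket *sock)
{
	int val = 1;

	kernel_setsockopt(sock, SOL_TCP, TCP_CORK, (char *)&val, sizeof(val));
	val = 0;
	kernel_setsockopt(sock, SOL_TCP, TCP_CORK, (char *)&val, sizeof(val));
}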