Re: Myrinet 10Gb odd behavior - SOLVED
Spoke too soon. Fine for a while (doing a 5 day rsync of 38TB) but getting those errors every 7 min. And I'm only getting 1.24Gb/s over a 10Gb jumbo link. Definitely causing connection issues.

Using it for ethernet.

Gonna go in tomorrow and give my Solarflare another shot. It was giving me issues, but the rel notes say to try this, so I will;

- The driver uses mbufs to store packet data which come from a set of
  pools of limited size. See man 7 tuning for more details. The following
  command can display the number of used and free mbufs within the pools
  the Solarflare driver uses

  # vmstat -z | head -n 1; vmstat -z | grep mbuf
  ITEM                   SIZE  LIMIT   USED   FREE  REQUESTS  FAILURES
  mbuf_cluster:          2048, 25600,  1408,   658,    31604,        0
  mbuf_jumbo_page:       4096, 12800,     0,    76,     2063,        0
  mbuf_jumbo_9k:         9216,  6400,     0,     0,        0,        0
  mbuf_jumbo_16k:       16384,  3200,     0,     0,        0,        0

  If a pool is exhausted (i.e. the failure count in the right hand column
  is non-zero), networking applications may hang or received packets may be
  dropped. Hence you may need to increase these limits using the following
  sysctls:

  kern.ipc.nmbclusters (for mbuf_cluster)
  kern.ipc.nmbjumbop (for mbuf_jumbo_page)
  kern.ipc.nmbjumbo9 (for mbuf_jumbo_9k)
  kern.ipc.nmbjumbo16 (for mbuf_jumbo_16k)

- aurf

On Aug 17, 2013, at 8:14 PM, iamatt wrote:

> Wow myricom still around... used to use the lanai stuff never on bsd though.
> All FDR Infiniband these days. Are you using the myrinet protocol or
> ethernet, just curious. Glad you got it working!
>
> On Aug 16, 2013 8:12 PM, "aurfalien" wrote:
> >
> > On Aug 16, 2013, at 8:47 AM, aurfalien wrote:
> >
> > > Forgot to mention my loader.conf;
> > >
> > > if_mxge_load="YES"
> > > mxge_ethp_z8e_load="YES"
> > > mxge_eth_z8e_load="YES"
> > > mxge_rss_ethp_z8e_load="YES"
> > > mxge_rss_eth_z8e_load="YES"
> > >
> > >
> > > I blindly added these w/o thinking what they do.
> > >
> > > Should I simply only load the first line?
> > >
> > > - aurf
> > >
> > >
> > > On Aug 16, 2013, at 8:18 AM, aurfalien wrote:
> > >
> > >> Hi,
> > >>
> > >> I've been suspecting my NIC is not up to par and notice this in the logs
> > >> every few minutes;
> > >>
> > >> Aug 16 08:05:06 prometheus kernel: mxge0: slice 0 struck? ring state:
> > >> Aug 16 08:05:06 prometheus kernel: mxge0: tx.req=1914503981 tx.done=1914503810, tx.queue_active=0
> > >> Aug 16 08:05:06 prometheus kernel: mxge0: tx.activate=0 tx.deactivate=0
> > >> Aug 16 08:05:06 prometheus kernel: mxge0: pkt_done=1824019832 fw=1824019931
> > >> Aug 16 08:05:06 prometheus kernel: mxge0: Watchdog reset!
> > >> Aug 16 08:05:06 prometheus kernel: mxge0: NIC did not reboot, not resetting
> > >>
> > >> Could this be affecting throughput?
> > >>
> > >> My card is a Myri-10G-PCIE-8A
> > >>
> > >> I did install the Myrinet dev tools for FreeBSD and ran myri_info which
> > >> yields;
> > >>
> > >> pci-dev at 05:00.0 vendor:product(rev)=14c1:0008(00)
> > >> behind bridge root-port: 00:03.0 8086:3c08 (x8.1/x16.3)
> > >> Myri-10G-PCIE-8A -- Link x8
> > >> EEPROM String-spec:
> > >> MAC=00:60:dd:45:73:23
> > >> SN=413665
> > >> PWR=100
> > >> PC=10G-PCIE-8A-R
> > >> PN=09-03852
> > >> XFI=AEL1010
> > >> TAG=ze_tools-1_4_45
> > >>
> > >> EEPROM MCP, PRESENT, length = 103384, crc=0x119daf46
> > >> ETHZ::1.4.45 2009/08/22 18:57:06 self extracting firmware
> > >> Bundle: exec_len=72144, PCI-ROM-len = 31232
> > >> Running MCP:
> > >> ETH ::1.4.55 -P- 2012/04/21 01:48:34 myri10ge firmware
> > >>
> > >> Any insights are appreciated.
> > >>
> > >> - aurf
> >
> >
> > Did the ole RTFM and reprogrammed the firmware, all good now.
> >
> > - aurf

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "freebsd-questions-unsubscr...@freebsd.org"
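
The pool check described in the release notes quoted above could be scripted roughly as follows. This is only a sketch, not anything shipped with the Solarflare driver: the function names `parse_mbuf_pools` and `exhausted_pools` are made up for illustration, the sample text is the output quoted in the message, and the parser assumes the `name: size, limit, used, free, requests, failures` column order shown there.

```python
# Sketch: find exhausted mbuf pools in `vmstat -z` output and name the
# sysctl that the release notes say to raise for each one.
# SAMPLE is the output quoted in the message above; on a live FreeBSD host
# you would pass in the real `vmstat -z` text instead.

SAMPLE = """\
ITEM                   SIZE  LIMIT   USED   FREE  REQUESTS  FAILURES
mbuf_cluster:          2048, 25600,  1408,   658,    31604,        0
mbuf_jumbo_page:       4096, 12800,     0,    76,     2063,        0
mbuf_jumbo_9k:         9216,  6400,     0,     0,        0,        0
mbuf_jumbo_16k:       16384,  3200,     0,     0,        0,        0
"""

# Pool-to-sysctl mapping exactly as listed in the release notes.
SYSCTL_FOR_POOL = {
    "mbuf_cluster": "kern.ipc.nmbclusters",
    "mbuf_jumbo_page": "kern.ipc.nmbjumbop",
    "mbuf_jumbo_9k": "kern.ipc.nmbjumbo9",
    "mbuf_jumbo_16k": "kern.ipc.nmbjumbo16",
}

# Assumed column order, taken from the header line in the sample.
FIELDS = ("size", "limit", "used", "free", "requests", "failures")


def parse_mbuf_pools(text):
    """Return {pool_name: {field: int}} for the pools in SYSCTL_FOR_POOL."""
    pools = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        name = name.strip()
        if not sep or name not in SYSCTL_FOR_POOL:
            continue  # header line or a zone we do not care about
        values = [int(v) for v in rest.split(",")[:len(FIELDS)]]
        if len(values) < len(FIELDS):
            continue  # unexpected layout, skip rather than guess
        pools[name] = dict(zip(FIELDS, values))
    return pools


def exhausted_pools(pools):
    """Return (pool, sysctl) pairs whose failure count is non-zero."""
    return [
        (name, SYSCTL_FOR_POOL[name])
        for name, stats in pools.items()
        if stats["failures"] > 0
    ]


if __name__ == "__main__":
    pools = parse_mbuf_pools(SAMPLE)
    for name, stats in pools.items():
        print(f"{name}: used={stats['used']} limit={stats['limit']} "
              f"failures={stats['failures']}")
    for name, sysctl in exhausted_pools(pools):
        print(f"{name} is exhausted, consider raising {sysctl}")
```

With the sample above no pool reports failures, so nothing is flagged; a non-zero value in the last column would print the matching sysctl name.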
Re: Myrinet 10Gb odd behavior - SOLVED
Wow myricom still around... used to use the lanai stuff, never on bsd though.
All FDR Infiniband these days. Are you using the myrinet protocol or
ethernet, just curious. Glad you got it working!

On Aug 16, 2013 8:12 PM, "aurfalien" wrote:
>
> On Aug 16, 2013, at 8:47 AM, aurfalien wrote:
>
> > Forgot to mention my loader.conf;
> >
> > if_mxge_load="YES"
> > mxge_ethp_z8e_load="YES"
> > mxge_eth_z8e_load="YES"
> > mxge_rss_ethp_z8e_load="YES"
> > mxge_rss_eth_z8e_load="YES"
> >
> >
> > I blindly added these w/o thinking what they do.
> >
> > Should I simply only load the first line?
> >
> > - aurf
> >
> >
> > On Aug 16, 2013, at 8:18 AM, aurfalien wrote:
> >
> >> Hi,
> >>
> >> I've been suspecting my NIC is not up to par and notice this in the logs
> >> every few minutes;
> >>
> >> Aug 16 08:05:06 prometheus kernel: mxge0: slice 0 struck? ring state:
> >> Aug 16 08:05:06 prometheus kernel: mxge0: tx.req=1914503981 tx.done=1914503810, tx.queue_active=0
> >> Aug 16 08:05:06 prometheus kernel: mxge0: tx.activate=0 tx.deactivate=0
> >> Aug 16 08:05:06 prometheus kernel: mxge0: pkt_done=1824019832 fw=1824019931
> >> Aug 16 08:05:06 prometheus kernel: mxge0: Watchdog reset!
> >> Aug 16 08:05:06 prometheus kernel: mxge0: NIC did not reboot, not resetting
> >>
> >> Could this be affecting throughput?
> >>
> >> My card is a Myri-10G-PCIE-8A
> >>
> >> I did install the Myrinet dev tools for FreeBSD and ran myri_info which
> >> yields;
> >>
> >> pci-dev at 05:00.0 vendor:product(rev)=14c1:0008(00)
> >> behind bridge root-port: 00:03.0 8086:3c08 (x8.1/x16.3)
> >> Myri-10G-PCIE-8A -- Link x8
> >> EEPROM String-spec:
> >> MAC=00:60:dd:45:73:23
> >> SN=413665
> >> PWR=100
> >> PC=10G-PCIE-8A-R
> >> PN=09-03852
> >> XFI=AEL1010
> >> TAG=ze_tools-1_4_45
> >>
> >> EEPROM MCP, PRESENT, length = 103384, crc=0x119daf46
> >> ETHZ::1.4.45 2009/08/22 18:57:06 self extracting firmware
> >> Bundle: exec_len=72144, PCI-ROM-len = 31232
> >> Running MCP:
> >> ETH ::1.4.55 -P- 2012/04/21 01:48:34 myri10ge firmware
> >>
> >> Any insights are appreciated.
> >>
> >> - aurf
>
>
> Did the ole RTFM and reprogrammed the firmware, all good now.
>
> - aurf
Re: Myrinet 10Gb odd behavior - SOLVED
On Aug 16, 2013, at 8:47 AM, aurfalien wrote:

> Forgot to mention my loader.conf;
>
> if_mxge_load="YES"
> mxge_ethp_z8e_load="YES"
> mxge_eth_z8e_load="YES"
> mxge_rss_ethp_z8e_load="YES"
> mxge_rss_eth_z8e_load="YES"
>
>
> I blindly added these w/o thinking what they do.
>
> Should I simply only load the first line?
>
> - aurf
>
>
> On Aug 16, 2013, at 8:18 AM, aurfalien wrote:
>
>> Hi,
>>
>> I've been suspecting my NIC is not up to par and notice this in the logs
>> every few minutes;
>>
>> Aug 16 08:05:06 prometheus kernel: mxge0: slice 0 struck? ring state:
>> Aug 16 08:05:06 prometheus kernel: mxge0: tx.req=1914503981 tx.done=1914503810, tx.queue_active=0
>> Aug 16 08:05:06 prometheus kernel: mxge0: tx.activate=0 tx.deactivate=0
>> Aug 16 08:05:06 prometheus kernel: mxge0: pkt_done=1824019832 fw=1824019931
>> Aug 16 08:05:06 prometheus kernel: mxge0: Watchdog reset!
>> Aug 16 08:05:06 prometheus kernel: mxge0: NIC did not reboot, not resetting
>>
>> Could this be affecting throughput?
>>
>> My card is a Myri-10G-PCIE-8A
>>
>> I did install the Myrinet dev tools for FreeBSD and ran myri_info which
>> yields;
>>
>> pci-dev at 05:00.0 vendor:product(rev)=14c1:0008(00)
>> behind bridge root-port: 00:03.0 8086:3c08 (x8.1/x16.3)
>> Myri-10G-PCIE-8A -- Link x8
>> EEPROM String-spec:
>> MAC=00:60:dd:45:73:23
>> SN=413665
>> PWR=100
>> PC=10G-PCIE-8A-R
>> PN=09-03852
>> XFI=AEL1010
>> TAG=ze_tools-1_4_45
>>
>> EEPROM MCP, PRESENT, length = 103384, crc=0x119daf46
>> ETHZ::1.4.45 2009/08/22 18:57:06 self extracting firmware
>> Bundle: exec_len=72144, PCI-ROM-len = 31232
>> Running MCP:
>> ETH ::1.4.55 -P- 2012/04/21 01:48:34 myri10ge firmware
>>
>> Any insights are appreciated.
>>
>> - aurf

Did the ole RTFM and reprogrammed the firmware, all good now.

- aurf
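
For anyone comparing watchdog dumps like the one quoted above, a small script can pull the counters out of the log lines and show the gaps between them. This is a rough sketch only: `parse_counters` and `counter_gaps` are hypothetical helper names, and reading `tx.req - tx.done` as "transmit requests not yet completed" and `fw - pkt_done` as "firmware ahead of the driver" is an assumption about what the mxge driver means by these fields, not something stated in the thread. Counter wraparound is not handled.

```python
import re

# Log excerpt from the original message, with the wrapped tx.req/tx.done
# line joined back into one line.
LOG = """\
Aug 16 08:05:06 prometheus kernel: mxge0: slice 0 struck? ring state:
Aug 16 08:05:06 prometheus kernel: mxge0: tx.req=1914503981 tx.done=1914503810, tx.queue_active=0
Aug 16 08:05:06 prometheus kernel: mxge0: tx.activate=0 tx.deactivate=0
Aug 16 08:05:06 prometheus kernel: mxge0: pkt_done=1824019832 fw=1824019931
Aug 16 08:05:06 prometheus kernel: mxge0: Watchdog reset!
Aug 16 08:05:06 prometheus kernel: mxge0: NIC did not reboot, not resetting
"""

# Matches key=value pairs such as tx.req=1914503981 or fw=1824019931.
COUNTER_RE = re.compile(r"([A-Za-z_.]+)=(\d+)")


def parse_counters(text):
    """Collect every name=number pair in the log text into a dict."""
    return {name: int(value) for name, value in COUNTER_RE.findall(text)}


def counter_gaps(counters):
    """Plain differences between related counters (interpretation assumed)."""
    return {
        "tx.req - tx.done": counters["tx.req"] - counters["tx.done"],
        "fw - pkt_done": counters["fw"] - counters["pkt_done"],
    }


if __name__ == "__main__":
    counters = parse_counters(LOG)
    for label, gap in counter_gaps(counters).items():
        print(f"{label} = {gap}")
```

For the excerpt above the two differences come out to 171 and 99. Tracking them across successive watchdog messages would show whether the gap is constant or growing.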