Can you try the previous driver? I think there are known issues with the latest one.

Todd Fujinaka
Software Application Engineer
Data Center Group
Intel Corporation
todd.fujin...@intel.com

-----Original Message-----
From: Fujinaka, Todd <todd.fujin...@intel.com> 
Sent: Wednesday, December 30, 2020 8:03 AM
To: Marc 'risson' Schmitt <ris...@cri.epita.fr>; 
e1000-devel@lists.sourceforge.net
Cc: c...@cri.epita.fr
Subject: Re: [E1000-devel] [i40e][bug] driver crashes machine under high 
network pressure

Unfortunately the kernel crash dump tells us very little besides that you were 
running networking at the time of the dump.

I would suggest that you file a bug here and attach the full dmesg.

Be advised that we generally need to reproduce the issue to make much progress 
and I don't know if we have any AMD systems.

Todd Fujinaka
Software Application Engineer
Data Center Group
Intel Corporation
todd.fujin...@intel.com

-----Original Message-----
From: Marc 'risson' Schmitt <ris...@cri.epita.fr> 
Sent: Tuesday, December 29, 2020 5:39 PM
To: Fujinaka, Todd <todd.fujin...@intel.com>; e1000-devel@lists.sourceforge.net
Cc: c...@cri.epita.fr
Subject: Re: [E1000-devel] [i40e][bug] driver crashes machine under high 
network pressure

Hi,

First, thanks for your swift response!

On 12/30/20 2:22 AM, Fujinaka, Todd wrote:
> First, sourceforge strips attachment so if you want to submit them you need 
> to open a bug and attach the files there.

I'll attach an extract of the kernel logs at the end of this email.

> Second, if the hardware is Dell, you need to submit the issue to Dell and 
> they will involve us if they need help. They want to troubleshoot problems 
> with their hardware because they need to track the issues. If it is Dell 
> hardware, don't open the bug here because we'll just have to tell you again 
> to submit the issue to Dell.
> 
> The third comment is that this looks like a possible known issue and with 
> Dell hardware you need to use the Dell-approved firmware and drivers. They 
> customize the hardware and firmware and you can't use the generic versions.

The X722-DA2 NICs in question were acquired directly from Intel and installed 
in the server by us. The server itself was indeed acquired from Dell.
We also upgraded the firmware for those NICs ourselves.

Regards,

--
Marc 'risson' Schmitt
CRI - EPITA

kernel: BUG: Bad page state in process swapper/20  pfn:79d345
kernel: page:fffff9f9de74d140 refcount:-1 mapcount:0 mapping:0000000000000000 index:0x0
kernel: flags: 0x57ffffc0000000()
kernel: raw: 0057ffffc0000000 dead000000000100 dead000000000122 0000000000000000
kernel: raw: 0000000000000000 0000000000000000 ffffffffffffffff 0000000000000000
kernel: page dumped because: nonzero _refcount
kernel: Modules linked in: cfg80211 xt_conntrack xt_MASQUERADE 
nf_conntrack_netlink xfrm_user xfrm_algo nft_counter xt_addrtype nft_compat 
nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ dge aufs overlay 
dell_rbu 8021q garp mrp stp llc bonding nls_iso8859_1 dm_multipath scsi_dh_rdac 
scsi_dh_emc scsi_dh_alua ipmi_ssif amd64_edac_mod edac_mce_amd amd_energy 
joydev input_leds cdc_ether dcdbas dell_wmi_descriptor wmi_bmof efi_pstore ccp 
k10temp acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter mac_hid 
sch_fq_codel ip_tables x_tables autofs4 btrfs blake2b_generic raid1 async_pq 
async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear 
hid_generic usbhid hid crct10dif_pclmul crc32_pclmul mgag200 
ghash_clmulni_intel i2c_algo_bit aesni_intel drm_kms_help rect glue_helper 
sysimgblt fb_sys_fops nvme
kernel:  cec ahci rc_core tg3 nvme_core libahci drm i40e(OE) xhci_pci i2c_piix4 xhci_pci_renesas wmi
kernel: CPU: 20 PID: 0 Comm: swapper/20 Tainted: G    B   W  OE 5.8.0-33-generic #36-Ubuntu
kernel: Hardware name: Dell Inc. PowerEdge R6525/0GK70M, BIOS 1.4.8 05/06/2020
kernel: Call Trace:
kernel:  <IRQ>
kernel:  show_stack+0x52/0x58
kernel:  dump_stack+0x70/0x8d
kernel:  bad_page.cold+0x63/0x94
kernel:  check_new_page_bad+0x6d/0x80
kernel:  rmqueue_bulk.constprop.0+0x38f/0x4c0
kernel:  rmqueue_pcplist.constprop.0+0x128/0x150
kernel:  rmqueue+0x3e/0x770
kernel:  get_page_from_freelist+0x197/0x2c0
kernel:  __alloc_pages_nodemask+0x15d/0x300
kernel:  i40e_alloc_rx_buffers+0x14a/0x260 [i40e]
kernel:  i40e_napi_poll+0xda3/0x1720 [i40e]
kernel:  napi_poll+0x96/0x1b0
kernel:  net_rx_action+0xb8/0x1c0
kernel:  __do_softirq+0xd0/0x2a1
kernel:  asm_call_irq_on_stack+0x12/0x20
kernel:  </IRQ>
kernel:  do_softirq_own_stack+0x3d/0x50
kernel:  irq_exit_rcu+0x95/0xd0
kernel:  common_interrupt+0x7c/0x150
kernel:  asm_common_interrupt+0x1e/0x40
kernel: RIP: 0010:native_safe_halt+0xe/0x10
kernel: Code: e5 8b 74 d0 04 8b 3c d0 e8 6f b3 49 ff 5d c3 cc cc cc cc cc cc cc 
cc cc cc cc cc cc e9 07 00 00 00 0f 00 2d 66 ee 43 00 fb f4 <c3> 90 e9 07 00 00 
00 0f 00 2d 56 ee 43 00 f4 c3 cc cc 0f
kernel: RSP: 0018:ffffa8d68033fe70 EFLAGS: 00000246
kernel: RAX: ffffffff94fcd3a0 RBX: ffff98a39ae5af00 RCX: ffff98a39f0ad440
kernel: RDX: 0000000004fd7af6 RSI: 0000000000000014 RDI: ffff98a39f09fa80
kernel: RBP: ffffa8d68033fe90 R08: 00000066a171bc54 R09: 0000000000000202
kernel: R10: 000000000003222e R11: 0000000000000000 R12: 0000000000000014
kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
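
For anyone triaging a trace like the one above, the following is a minimal Python sketch (not part of the original report) that picks out the call-trace frames attributed to a given module, here i40e. The helper name `module_frames` and the regular expression are illustrative assumptions based only on the line format visible in this log, not a general dmesg parser.

```python
import re

# Matches call-trace lines in the format seen in this log, for example:
#   kernel:  i40e_alloc_rx_buffers+0x14a/0x260 [i40e]
# The trailing "[module]" tag is optional (built-in symbols have none).
FRAME_RE = re.compile(
    r"^kernel:\s+(?P<symbol>[\w.]+)"
    r"\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?\s*$"
)


def module_frames(log_text, module):
    """Return (symbol, offset) pairs for frames tagged with [module].

    Hypothetical helper for illustration only.
    """
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line)
        if match and match.group("module") == module:
            frames.append((match.group("symbol"), int(match.group("offset"), 16)))
    return frames


# A few frames copied from the trace above.
sample = """\
kernel:  __alloc_pages_nodemask+0x15d/0x300
kernel:  i40e_alloc_rx_buffers+0x14a/0x260 [i40e]
kernel:  i40e_napi_poll+0xda3/0x1720 [i40e]
kernel:  napi_poll+0x96/0x1b0
"""

for symbol, offset in module_frames(sample, "i40e"):
    print(f"{symbol} +{offset:#x}")
```

Run against the full log, this would list only the two i40e frames (`i40e_alloc_rx_buffers` and `i40e_napi_poll`), which shows that the bad page was detected while the driver was allocating receive buffers from its NAPI poll loop.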

_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel Ethernet, visit 
https://forums.intel.com/s/topic/0TO0P00000018NbWAI/intel-ethernet

