Re: Kernel Panic in VMXNET3 Driver
I am on Debian stretch, kernel 4.9.30-2+deb9u2. Is there a link to submit this bug to? On Ubuntu's 4.4.0-83-generic this is not reproducible.

On Mon, Jul 3, 2017 at 6:45 PM, deloptes <delop...@gmail.com> wrote:

> Mini Trader wrote:
>
> > More information. I found a post here:
> >
> > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1650635
> >
> > They mention LRO so I disabled it. I cannot reproduce if LRO is
> > disabled.
> >
> > On Mon, Jul 3, 2017 at 5:59 PM, Mini Trader <miniflowtra...@gmail.com>
> > wrote:
> >
> >> Looks to be the same.
> >>
> >> On Mon, Jul 3, 2017 at 5:48 PM, deloptes <delop...@gmail.com> wrote:
> >>
> >>> Mini Trader wrote:
> >>>
> >>> > task_numa_fault+0x6ed/0xd20
> >>>
> >>> what happens if you boot the vm with numa=off
>
> the change mentioned in the article is in the kernel - at least 4.11.2 - I
> don't know about your version. The article is about 4.4.
>
> perhaps solved by chance - you have to report back to kernel maintainer I
> guess
>
> 4.11.2 and 4.10.14
>
> line nr 1395 to 1415
>                         if (VMXNET3_VERSION_GE_2(adapter) &&
>                             rcd->type == VMXNET3_CDTYPE_RXCOMP_LRO) {
>                                 struct Vmxnet3_RxCompDescExt *rcdlro;
>                                 rcdlro = (struct Vmxnet3_RxCompDescExt *)rcd;
>
>                                 segCnt = rcdlro->segCnt;
>                                 WARN_ON_ONCE(segCnt == 0);
>                                 mss = rcdlro->mss;
>                                 if (unlikely(segCnt <= 1))
>                                         segCnt = 0;
>                         } else {
>                                 segCnt = 0;
>                         }
>                 } else {
>                         BUG_ON(ctx->skb == NULL && !skip_page_frags);
>
>                         /* non SOP buffer must be type 1 in most cases */
>                         BUG_ON(rbi->buf_type != VMXNET3_RX_BUF_PAGE);
>                         BUG_ON(rxd->btype != VMXNET3_RXD_BTYPE_BODY);
Re: Kernel Panic in VMXNET3 Driver
More information. I found a post here:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1650635

They mention LRO so I disabled it. I cannot reproduce if LRO is disabled.

On Mon, Jul 3, 2017 at 5:59 PM, Mini Trader <miniflowtra...@gmail.com> wrote:

> Looks to be the same.
>
> On Mon, Jul 3, 2017 at 5:48 PM, deloptes <delop...@gmail.com> wrote:
>
>> Mini Trader wrote:
>>
>> > task_numa_fault+0x6ed/0xd20
>>
>> what happens if you boot the vm with numa=off
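For anyone checking the same workaround, here is a minimal sketch of how the LRO state could be read from the text that `ethtool -k <iface>` prints. The helper name `lro_enabled` and the interface name `eth0` used below are assumptions for illustration and do not come from this thread; the function only parses text and does not call ethtool itself.

```python
import re


def lro_enabled(ethtool_output: str) -> bool:
    """Return True if `ethtool -k <iface>` output reports LRO as on.

    Hypothetical helper for illustration: it parses text only.
    """
    for line in ethtool_output.splitlines():
        # Feature lines look like "large-receive-offload: on"
        # or "large-receive-offload: off [fixed]".
        match = re.match(r"\s*large-receive-offload:\s*(on|off)\b", line)
        if match:
            return match.group(1) == "on"
    raise ValueError("large-receive-offload not found in ethtool output")
```

The input would typically come from running `ethtool -k eth0`, and the offload is normally switched off with `ethtool -K eth0 lro off` as root. That setting does not persist across reboots unless it is also added to the interface configuration.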
Re: Kernel Panic in VMXNET3 Driver
Looks to be the same.

On Mon, Jul 3, 2017 at 5:48 PM, deloptes <delop...@gmail.com> wrote:

> Mini Trader wrote:
>
> > task_numa_fault+0x6ed/0xd20
>
> what happens if you boot the vm with numa=off
Kernel Panic in VMXNET3 Driver
I posted earlier that I was having issues with a Java program, but after some quick digging I've been able to identify the error. I enabled logging of kernel messages over the network (netconsole) and was able to capture this. I would appreciate some guidance on where to go from here.

[ 118.656721] [ cut here ]
[ 118.657261] kernel BUG at /build/linux-9uDFZV/linux-4.9.30/drivers/net/vmxnet3/vmxnet3_drv.c:1413!
[ 118.658106] invalid opcode: [#1] SMP
[ 118.658628] Modules linked in: netconsole configfs sb_edac edac_core coretemp crct10dif_pclmul ppdev crc32_pclmul vmw_balloon ghash_clmulni_intel intel_rapl_perf joydev serio_raw pcspkr sg shpchp vmwgfx vmw_vmci ttm drm_kms_helper drm nfit libnvdimm battery evdev parport_pc parport ac acpi_cpufreq button ip_tables x_tables autofs4 ext4 crc16 jbd2 crc32c_generic fscrypto ecb mbcache dm_mod sr_mod cdrom sd_mod ata_generic crc32c_intel aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd psmouse ata_piix vmxnet3 vmw_pvscsi i2c_piix4 libata scsi_mod
[ 118.661572] CPU: 0 PID: 891 Comm: java Not tainted 4.9.0-3-amd64 #1 Debian 4.9.30-2+deb9u2
[ 118.662153] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/05/2016
[ 118.663136] task: 8dd87882a0c0 task.stack: a9998147
[ 118.663695] RIP: 0010:[] [] vmxnet3_rq_rx_complete+0x905/0xf10 [vmxnet3]
[ 118.664735] RSP: :8dd87fc03e38 EFLAGS: 00010297
[ 118.665279] RAX: RBX: RCX: 8dd878ad5a00
[ 118.665857] RDX: 0040 RSI: 0001 RDI: 0040
[ 118.666426] RBP: 0001 R08: R09: 0028
[ 118.666984] R10: R11: 8dd8369688c0 R12: 8dd8369690c0
[ 118.667613] R13: 8dd835ea8330 R14: 8dd87c192010 R15: 8dd835ed8018
[ 118.668172] FS: 7f1b1974e700() GS:8dd87fc0() knlGS:
[ 118.668723] CS: 0010 DS: ES: CR0: 80050033
[ 118.669310] CR2: 7f1ad63e2fd8 CR3: 7bcac000 CR4: 003406f0
[ 118.669892] DR0: DR1: DR2:
[ 118.670520] DR3: DR6: fffe0ff0 DR7: 0400
[ 118.671099] Stack:
[ 118.671677] 8dd836969190 0047 0002 8dd8369690e0
[ 118.672313] b22b002d 8dd8369688c0 8dd8369688c0
[ 118.672950] 8dd8369688c0 8dd8369690e0 0040
[ 118.673603] Call Trace:
[ 118.674166]
[ 118.674180] [] ? task_numa_fault+0x6ed/0xd20
[ 118.674876] [] ? vmxnet3_poll_rx_only+0x35/0xa0 [vmxnet3]
[ 118.675479] [] ? net_rx_action+0x240/0x370
[ 118.676099] [] ? __do_softirq+0x105/0x290
[ 118.676681] [] ? irq_exit+0xae/0xb0
[ 118.677294] [] ? do_IRQ+0x4f/0xd0
[ 118.677971] [] ? common_interrupt+0x82/0x82
[ 118.678522]
[ 118.678532] Code: 89 54 24 28 e8 9d 72 4e f2 0f b6 44 24 30 4c 8b 5c 24 38 4c 8b 54 24 28 49 c7 84 24 48 01 00 00 00 00 00 00 89 c6 e9 59 f8 ff ff <0f> 0b 0f 0b 49 83 84 24 a0 01 00 00 01 49 c7 84 24 48 01 00 00
[ 118.680438] RIP [] vmxnet3_rq_rx_complete+0x905/0xf10 [vmxnet3]
[ 118.681063] RSP
[ 118.681676] ---[ end trace 5927dc1afdb8f3dd ]---
[ 118.682244] Kernel panic - not syncing: Fatal exception in interrupt
[ 118.682880] Kernel Offset: 0x3120 from 0x8100 (relocation range: 0x8000-0xbfff)
[ 118.683993] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
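The most useful detail in a capture like this is the `kernel BUG at <file>:<line>!` marker, since it names the source line where the check fired. As a rough sketch, it could be pulled out of a saved netconsole log as follows. The names `BugLocation` and `find_bug_location` are hypothetical and introduced only for this example.

```python
import re
from typing import NamedTuple, Optional


class BugLocation(NamedTuple):
    """Source file and line reported by a 'kernel BUG at' message."""
    path: str
    line: int


def find_bug_location(log_text: str) -> Optional[BugLocation]:
    """Return the first 'kernel BUG at <path>:<line>!' location, or None."""
    match = re.search(r"kernel BUG at (\S+):(\d+)!", log_text)
    if match is None:
        return None
    return BugLocation(path=match.group(1), line=int(match.group(2)))
```

For the trace above this yields the vmxnet3_drv.c path and line 1413, which can then be compared against the driver source for the exact kernel version that is running.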
Re: Tracing hard Lockup
Java 7 programs rarely have issues on Java 8, and if they do it's a program error, not a system one. At this point I'll go back to Jessie or Ubuntu 16. Thanks for the feedback.

On Sat, Jul 1, 2017 at 6:41 PM deloptes <delop...@gmail.com> wrote:

> Mini Trader wrote:
>
> > I tried oracles version as well and same response. Is there any sort of
> > utility that can be used to sandbox this and see what is going on?
>
> If a java program was compiled and designed to run on specific version it
> could be that it causes problems on higher version - especially java 1.7 to
> 1.8 changed a lot in how things are handled.
> Did you try with older java version or with the previous one?
>
> I've been using visualvm to benchmark java threading - I'm not sure if it
> will help you. You can set some tight limits to your java program and see
> where it breaks and create a dump. Enable debugging and garbage collector
> also tight.
>
> As you have a newer kernel, you can set some limits there too - I don't
> recall exactly but there was an option to purge each app using higher than
> limit memory. how it is done in ESX - no idea can't you run it on
> another machine
Re: Tracing hard Lockup
I tried Oracle's version as well, with the same result. Is there any sort of utility that can be used to sandbox this and see what is going on? Are there any log files other than kern.log or syslog?

On Sat, Jul 1, 2017 at 4:42 PM deloptes <delop...@gmail.com> wrote:

> Mini Trader wrote:
>
> > My last two cents on this are that the vCenter consoles shows the machine
> > going to 100% CPU. I've tried even executing the java process with the
> > nice command to see if this could throttle the CPU.
> >
> > Whatever it is that this java process is doing is completely throwing the
> > system into a loop hole on this release which I find very odd.
>
> with no log information or information what exactly this program is doing
> not much can be done.
>
> I've faced some similar issues with wrong or incomplete setups, when wrong
> libraries were on the way or similar. But you say it is a fresh install ...
> so something with openjre/jdk must be wrong. It could be specific to ESX
> and the combination as well, but I have no clue
>
> I prefer using the oracles version actually ...
Re: Tracing hard Lockup
My last two cents on this: the vCenter console shows the machine going to 100% CPU. I've even tried executing the Java process with the nice command to see if this could throttle the CPU.

Whatever this Java process is doing, it is throwing the whole system into a loop on this release, which I find very odd.

On Sat, Jul 1, 2017 at 10:39 AM, Mini Trader <miniflowtra...@gmail.com> wrote:

> This isn't reserving JVM heap space > 2GB. This is a java process taking
> down an entire system.
>
> I didn't have any memory issues. They do mention that it was fixed in version
> 4.9.30-2+deb9u2. Which I do have installed. The system is a completely new
> and updated install.
>
> On Sat, Jul 1, 2017 at 10:22 AM, deloptes <delop...@gmail.com> wrote:
>
>> Mini Trader wrote:
>>
>> > I've installed Debian 9/Stetch and am having issues with what looks to be
>> > a java application.
>> >
>> > When connecting to the app from certain devices the entire system will
>> > lockup inside an ESXi VM.
>> >
>> > There doesn't appear to be any errors. The system just locks up, no
>> > console, no IP nothing works. The ESXi host is good as it runs everything
>> > else fine and has so for a long time.
>> >
>> > Additionally, I've run this application on other systems (kernel
>> > 4.4/ubuntu 16 LTS) and have no issues at all. How can I track this down?
>> > Nothing is in syslog or kernlog.
>> >
>> > Thanks!
>>
>> There was another thread "Problem reserving enough space for Java object
>> heap since stretch upgrade"
>>
>> regards
Re: Tracing hard Lockup
This isn't about reserving JVM heap space > 2GB. This is a Java process taking down an entire system.

I didn't have any memory issues. They do mention that it was fixed in version 4.9.30-2+deb9u2, which I do have installed. The system is a completely new and updated install.

On Sat, Jul 1, 2017 at 10:22 AM, deloptes <delop...@gmail.com> wrote:

> Mini Trader wrote:
>
> > I've installed Debian 9/Stetch and am having issues with what looks to be
> > a java application.
> >
> > When connecting to the app from certain devices the entire system will
> > lockup inside an ESXi VM.
> >
> > There doesn't appear to be any errors. The system just locks up, no
> > console, no IP nothing works. The ESXi host is good as it runs everything
> > else fine and has so for a long time.
> >
> > Additionally, I've run this application on other systems (kernel
> > 4.4/ubuntu 16 LTS) and have no issues at all. How can I track this down?
> > Nothing is in syslog or kernlog.
> >
> > Thanks!
>
> There was another thread "Problem reserving enough space for Java object
> heap since stretch upgrade"
>
> regards
Tracing hard Lockup
I've installed Debian 9/Stretch and am having issues with what looks to be a Java application.

When connecting to the app from certain devices, the entire system locks up inside an ESXi VM.

There don't appear to be any errors. The system just locks up: no console, no IP, nothing works. The ESXi host is good, as it runs everything else fine and has done so for a long time.

Additionally, I've run this application on other systems (kernel 4.4/Ubuntu 16 LTS) and have no issues at all. How can I track this down? Nothing is in syslog or kern.log.

Thanks!
Re: NFS 4 id mapping does not work on auto mount
For whatever reason, it seems that there is a delay after boot before the IDs are pulled. They show up after around 10 minutes. Issuing nfsidmap -c causes them to be loaded immediately.

On Sat, Jan 7, 2017 at 2:40 PM Mini Trader <miniflowtra...@gmail.com> wrote:

> Hello all,
>
> I am having some issues with ID mapping.
>
> The behavior that I notice is as follows.
>
> 1. If I mount via /etc/fstab when automount is enabled, then I have no id
> mapping
> 2. If I mount after boot via mount -a (fstab contains noauto entry) my id
> mapping is correct
>
> Client is running:
>
> Linux backup 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u2 (2016-10-19)
> x86_64 GNU/Linux
>
> Any thoughts on what this might be? Something with timing or order.
>
> Thanks.
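If the workaround needs to be scripted, for example from a job that runs shortly after boot, a minimal sketch might look like the following. The function name `clear_idmap_keyring` and the injectable `run` parameter are assumptions made for this example. The only real command involved is `nfsidmap -c`, which clears the ID-mapping keyring and normally needs root.

```python
import subprocess
from typing import Callable, List


def clear_idmap_keyring(run: Callable[..., object] = subprocess.run) -> List[str]:
    """Run `nfsidmap -c` and return the command that was executed.

    `run` is injectable (an assumption of this sketch) so the helper can be
    exercised without touching the real keyring.
    """
    command = ["nfsidmap", "-c"]
    # check=True makes subprocess.run raise if nfsidmap exits non-zero.
    run(command, check=True)
    return command
```

This only automates the manual step described above. It does not explain why the mappings are missing right after boot.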
NFS 4 id mapping does not work on auto mount
Hello all,

I am having some issues with ID mapping.

The behavior that I notice is as follows.

1. If I mount via /etc/fstab when automount is enabled, then I have no id mapping
2. If I mount after boot via mount -a (fstab contains noauto entry) my id mapping is correct

Client is running:

Linux backup 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u2 (2016-10-19) x86_64 GNU/Linux

Any thoughts on what this might be? Something with timing or order.

Thanks.