Re: Kernel Panic in VMXNET3 Driver

2017-07-03 Thread Mini Trader
I am on stretch, Debian 4.9.30-2+deb9u2.  Is there a link to submit this
bug to?

On Ubuntu's 4.4.0-83-generic this is not reproducible.


On Mon, Jul 3, 2017 at 6:45 PM, deloptes <delop...@gmail.com> wrote:

> Mini Trader wrote:
>
> > More information.  I found a post here:
> >
> > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1650635
> >
> > They mention LRO so I disabled it.  I cannot reproduce if LRO is
> > disabled.
> >
> > On Mon, Jul 3, 2017 at 5:59 PM, Mini Trader <miniflowtra...@gmail.com>
> > wrote:
> >
> >> Looks to be the same.
> >>
> >> On Mon, Jul 3, 2017 at 5:48 PM, deloptes <delop...@gmail.com> wrote:
> >>
> >>> Mini Trader wrote:
> >>>
> >>> > task_numa_fault+0x6ed/0xd20
> >>>
> >>> what happens if you boot the vm with numa=off
> >>>
> >>>
> >>>
> >>>
> >>
>
> The change mentioned in the article is in the kernel, at least in 4.11.2. I
> don't know about your version. The article is about 4.4.
>
> Perhaps it was solved by chance. You will have to report back to the kernel
> maintainer, I guess.
>
> In 4.11.2 and 4.10.14, lines 1395 to 1415 read:
>
>     if (VMXNET3_VERSION_GE_2(adapter) &&
>         rcd->type == VMXNET3_CDTYPE_RXCOMP_LRO) {
>         struct Vmxnet3_RxCompDescExt *rcdlro;
>         rcdlro = (struct Vmxnet3_RxCompDescExt *)rcd;
>
>         segCnt = rcdlro->segCnt;
>         WARN_ON_ONCE(segCnt == 0);
>         mss = rcdlro->mss;
>         if (unlikely(segCnt <= 1))
>             segCnt = 0;
>     } else {
>         segCnt = 0;
>     }
> } else {
>     BUG_ON(ctx->skb == NULL && !skip_page_frags);
>
>     /* non SOP buffer must be type 1 in most cases */
>     BUG_ON(rbi->buf_type != VMXNET3_RX_BUF_PAGE);
>     BUG_ON(rxd->btype != VMXNET3_RXD_BTYPE_BODY);
>
>
>
>


Re: Kernel Panic in VMXNET3 Driver

2017-07-03 Thread Mini Trader
More information.  I found a post here:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1650635

They mention LRO so I disabled it.  I cannot reproduce if LRO is disabled.
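
For anyone hitting the same panic, the LRO workaround can be sketched roughly as below. This is only a sketch and not a confirmed fix: the interface name ens192 is an assumption (use whatever your vmxnet3 interface is called), and the helper name lro_state is made up for illustration. It relies on `ethtool -k` listing the offload settings and `ethtool -K <iface> lro off` switching large receive offload off. The setting does not survive a reboot on its own.

```shell
#!/bin/sh
# Hypothetical helper: read `ethtool -k <iface>` output on stdin and
# print the state of large receive offload ("on" or "off").
lro_state() {
    awk '$1 == "large-receive-offload:" { print $2 }'
}

# Usage (needs root; ens192 is an assumed interface name):
#   ethtool -k ens192 | lro_state     # show the current state
#   ethtool -K ens192 lro off         # disable LRO until the next reboot
#   ethtool -k ens192 | lro_state     # check the state again
```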

On Mon, Jul 3, 2017 at 5:59 PM, Mini Trader <miniflowtra...@gmail.com>
wrote:

> Looks to be the same.
>
> On Mon, Jul 3, 2017 at 5:48 PM, deloptes <delop...@gmail.com> wrote:
>
>> Mini Trader wrote:
>>
>> > task_numa_fault+0x6ed/0xd20
>>
>> what happens if you boot the vm with numa=off
>>
>>
>>
>>
>


Re: Kernel Panic in VMXNET3 Driver

2017-07-03 Thread Mini Trader
Looks to be the same.

On Mon, Jul 3, 2017 at 5:48 PM, deloptes <delop...@gmail.com> wrote:

> Mini Trader wrote:
>
> > task_numa_fault+0x6ed/0xd20
>
> what happens if you boot the vm with numa=off
>
>
>
>
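
In case it helps someone repeating the numa=off test, here is a small sketch for checking that the parameter really reached the running kernel. The helper name has_boot_param is hypothetical. The only assumption about the system is that /proc/cmdline holds the boot command line, which is standard on Linux.

```shell
#!/bin/sh
# Hypothetical helper: read a kernel command line on stdin and succeed
# if the given parameter appears as a whole word.
has_boot_param() {
    tr ' ' '\n' | grep -qxF -- "$1"
}

# Usage on the running VM:
#   has_boot_param numa=off < /proc/cmdline && echo "booted with numa=off"
# On Debian the parameter is usually added to GRUB_CMDLINE_LINUX_DEFAULT
# in /etc/default/grub, followed by update-grub and a reboot.
```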


Kernel Panic in VMXNET3 Driver

2017-07-03 Thread Mini Trader
I had posted earlier that I was having issues with a Java program, but after
doing some quick digging I've been able to identify the error.  I enabled
logging of kernel messages over the network and was able to capture this.
I would appreciate some guidance on where to go from here.

[  118.656721] [ cut here ]
[  118.657261] kernel BUG at /build/linux-9uDFZV/linux-4.9.30/drivers/net/vmxnet3/vmxnet3_drv.c:1413!
[  118.658106] invalid opcode:  [#1] SMP
[  118.658628] Modules linked in: netconsole configfs sb_edac edac_core coretemp crct10dif_pclmul ppdev crc32_pclmul vmw_balloon ghash_clmulni_intel intel_rapl_perf joydev serio_raw pcspkr sg shpchp vmwgfx vmw_vmci ttm drm_kms_helper drm nfit libnvdimm battery evdev parport_pc parport ac acpi_cpufreq button ip_tables x_tables autofs4 ext4 crc16 jbd2 crc32c_generic fscrypto ecb mbcache dm_mod sr_mod cdrom sd_mod ata_generic crc32c_intel aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd psmouse ata_piix vmxnet3 vmw_pvscsi i2c_piix4 libata scsi_mod
[  118.661572] CPU: 0 PID: 891 Comm: java Not tainted 4.9.0-3-amd64 #1 Debian 4.9.30-2+deb9u2
[  118.662153] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/05/2016
[  118.663136] task: 8dd87882a0c0 task.stack: a9998147
[  118.663695] RIP: 0010:[]  [] vmxnet3_rq_rx_complete+0x905/0xf10 [vmxnet3]
[  118.664735] RSP: :8dd87fc03e38  EFLAGS: 00010297
[  118.665279] RAX:  RBX:  RCX: 8dd878ad5a00
[  118.665857] RDX: 0040 RSI: 0001 RDI: 0040
[  118.666426] RBP: 0001 R08:  R09: 0028
[  118.666984] R10:  R11: 8dd8369688c0 R12: 8dd8369690c0
[  118.667613] R13: 8dd835ea8330 R14: 8dd87c192010 R15: 8dd835ed8018
[  118.668172] FS:  7f1b1974e700() GS:8dd87fc0() knlGS:
[  118.668723] CS:  0010 DS:  ES:  CR0: 80050033
[  118.669310] CR2: 7f1ad63e2fd8 CR3: 7bcac000 CR4: 003406f0
[  118.669892] DR0:  DR1:  DR2:
[  118.670520] DR3:  DR6: fffe0ff0 DR7: 0400
[  118.671099] Stack:
[  118.671677]  8dd836969190 0047 0002 8dd8369690e0
[  118.672313]  b22b002d 8dd8369688c0  8dd8369688c0
[  118.672950]   8dd8369688c0 8dd8369690e0 0040
[  118.673603] Call Trace:
[  118.674166]
[  118.674180]  [] ? task_numa_fault+0x6ed/0xd20
[  118.674876]  [] ? vmxnet3_poll_rx_only+0x35/0xa0 [vmxnet3]
[  118.675479]  [] ? net_rx_action+0x240/0x370
[  118.676099]  [] ? __do_softirq+0x105/0x290
[  118.676681]  [] ? irq_exit+0xae/0xb0
[  118.677294]  [] ? do_IRQ+0x4f/0xd0
[  118.677971]  [] ? common_interrupt+0x82/0x82
[  118.678522]
[  118.678532] Code: 89 54 24 28 e8 9d 72 4e f2 0f b6 44 24 30 4c 8b 5c 24 38 4c 8b 54 24 28 49 c7 84 24 48 01 00 00 00 00 00 00 89 c6 e9 59 f8 ff ff <0f> 0b 0f 0b 49 83 84 24 a0 01 00 00 01 49 c7 84 24 48 01 00 00
[  118.680438] RIP  [] vmxnet3_rq_rx_complete+0x905/0xf10 [vmxnet3]
[  118.681063]  RSP
[  118.681676] ---[ end trace 5927dc1afdb8f3dd ]---
[  118.682244] Kernel panic - not syncing: Fatal exception in interrupt
[  118.682880] Kernel Offset: 0x3120 from 0x8100 (relocation range: 0x8000-0xbfff)
[  118.683993] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
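
For reference, a capture like the one above can be set up with netconsole along the following lines. Treat this as a sketch under assumptions rather than the exact setup used here: the interface name ens192, the receiver address 192.0.2.10 and the port 6666 are placeholders, and the helper netconsole_param is made up for illustration. The parameter layout follows the kernel's netconsole documentation, [src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-mac], where empty fields fall back to the defaults.

```shell
#!/bin/sh
# Hypothetical helper: build the netconsole module parameter in the form
#   [src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-mac]
# Source port and source IP are left empty so the kernel picks defaults.
netconsole_param() {
    # $1=device  $2=target IP  $3=target port (optional)  $4=target MAC (optional)
    printf '@/%s,%s@%s/%s\n' "$1" "${3:-}" "$2" "${4:-}"
}

# Usage (run as root on the crashing VM; all values are assumed examples):
#   modprobe netconsole netconsole="$(netconsole_param ens192 192.0.2.10 6666)"
#   dmesg -n 8      # let all kernel messages reach the console
# On the receiving host (flag syntax differs between netcat variants):
#   nc -u -l -p 6666
```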


Re: Tracing hard Lockup

2017-07-01 Thread Mini Trader
Java 7 programs rarely have issues on Java 8, and if they do it's a program
error, not a system one. At this point I'll go back to Jessie or Ubuntu 16.

Thanks for the feedback.

On Sat, Jul 1, 2017 at 6:41 PM deloptes <delop...@gmail.com> wrote:

> Mini Trader wrote:
>
> > I tried Oracle's version as well and got the same result. Is there any
> > sort of utility that can be used to sandbox this and see what is going on?
> >
>
> If a Java program was compiled and designed to run on a specific version, it
> could be that it causes problems on a higher version. Especially from Java
> 1.7 to 1.8, a lot changed in how things are handled.
> Did you try with an older Java version or with the previous one?
>
> I've been using visualvm to benchmark Java threading. I'm not sure if it
> will help you. You can set some tight limits on your Java program, see
> where it breaks, and create a dump. Enable debugging, and keep the garbage
> collector tight as well.
>
> As you have a newer kernel, you can set some limits there too. I don't
> recall exactly, but there was an option to purge any app that uses more
> memory than a limit. How it is done in ESX, I have no idea. Can't you run
> it on another machine?
>
>


Re: Tracing hard Lockup

2017-07-01 Thread Mini Trader
I tried Oracle's version as well and got the same result. Is there any sort of
utility that can be used to sandbox this and see what is going on?

Any log files other than kernlog or syslog?

On Sat, Jul 1, 2017 at 4:42 PM deloptes <delop...@gmail.com> wrote:

> Mini Trader wrote:
>
> > My last two cents on this are that the vCenter console shows the machine
> > going to 100% CPU.  I've even tried executing the java process with the
> > nice command to see if this could throttle the CPU.
> >
> > Whatever it is that this java process is doing is completely throwing the
> > system into a loop on this release, which I find very odd.
>
>
> With no log information, or information about what exactly this program is
> doing, not much can be done.
>
> I've faced some similar issues with wrong or incomplete setups, when wrong
> libraries were in the way or similar. But you say it is a fresh install ...
> so something with openjre/jdk must be wrong. It could be specific to ESX
> and the combination as well, but I have no clue.
>
> I prefer using Oracle's version, actually ...


Re: Tracing hard Lockup

2017-07-01 Thread Mini Trader
My last two cents on this are that the vCenter console shows the machine
going to 100% CPU.  I've even tried executing the java process with the
nice command to see if this could throttle the CPU.

Whatever it is that this java process is doing is completely throwing the
system into a loop on this release, which I find very odd.

On Sat, Jul 1, 2017 at 10:39 AM, Mini Trader <miniflowtra...@gmail.com>
wrote:

> This isn't reserving JVM heap space > 2GB.  This is a java process taking
> down an entire system.
>
> I didn't have any memory issues.  They do mention that it was fixed in
> version 4.9.30-2+deb9u2, which I do have installed. The system is a
> completely new and updated install.
>
> On Sat, Jul 1, 2017 at 10:22 AM, deloptes <delop...@gmail.com> wrote:
>
>> Mini Trader wrote:
>>
>> > I've installed Debian 9/Stretch and am having issues with what looks to
>> > be a java application.
>> >
>> > When connecting to the app from certain devices the entire system will
>> > lock up inside an ESXi VM.
>> >
>> > There don't appear to be any errors.  The system just locks up: no
>> > console, no IP, nothing works. The ESXi host is good, as it runs
>> > everything else fine and has done so for a long time.
>> >
>> > Additionally, I've run this application on other systems (kernel
>> > 4.4/Ubuntu 16 LTS) and have no issues at all.  How can I track this
>> > down?  Nothing is in syslog or kernlog.
>> >
>> > Thanks!
>>
>> There was another thread "Problem reserving enough space for Java object
>> heap since stretch upgrade"
>>
>> regards
>>
>>
>


Re: Tracing hard Lockup

2017-07-01 Thread Mini Trader
This isn't reserving JVM heap space > 2GB.  This is a java process taking
down an entire system.

I didn't have any memory issues.  They do mention that it was fixed in version
4.9.30-2+deb9u2, which I do have installed. The system is a completely new
and updated install.

On Sat, Jul 1, 2017 at 10:22 AM, deloptes <delop...@gmail.com> wrote:

> Mini Trader wrote:
>
> > I've installed Debian 9/Stretch and am having issues with what looks to
> > be a java application.
> >
> > When connecting to the app from certain devices the entire system will
> > lock up inside an ESXi VM.
> >
> > There don't appear to be any errors.  The system just locks up: no
> > console, no IP, nothing works. The ESXi host is good, as it runs
> > everything else fine and has done so for a long time.
> >
> > Additionally, I've run this application on other systems (kernel
> > 4.4/Ubuntu 16 LTS) and have no issues at all.  How can I track this
> > down?  Nothing is in syslog or kernlog.
> >
> > Thanks!
>
> There was another thread "Problem reserving enough space for Java object
> heap since stretch upgrade"
>
> regards
>
>


Tracing hard Lockup

2017-07-01 Thread Mini Trader
I've installed Debian 9/Stretch and am having issues with what looks to be a
java application.

When connecting to the app from certain devices the entire system will
lock up inside an ESXi VM.

There don't appear to be any errors.  The system just locks up: no
console, no IP, nothing works. The ESXi host is good, as it runs everything
else fine and has done so for a long time.

Additionally, I've run this application on other systems (kernel 4.4/Ubuntu
16 LTS) and have no issues at all.  How can I track this down?  Nothing is
in syslog or kernlog.

Thanks!


Tracing Hard Lockup

2017-06-30 Thread Mini Trader
I've installed Debian 9/Stretch and am having issues with what looks to be a
java application.

When connecting to the app from certain devices the entire system will
lock up inside an ESXi VM.

There don't appear to be any errors.  The system just locks up: no more
IP, nothing works.  The ESXi host is good, as it runs everything else fine
and has done so for a long time.

Additionally, I've run this application on other systems (kernel 4.4/Ubuntu
16 LTS) and have no issues at all.  How can I track this down?  Nothing is
in syslog or kernlog.

Thanks!


Re: NFS 4 id mapping does not work on auto mount

2017-01-08 Thread Mini Trader
So, for whatever reason, it seems that there is a slight delay after boot
before the IDs are pulled. They arrive after around 10 minutes or so.

Issuing nfsidmap -c will cause them to be loaded immediately.
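
A rough sketch of how that could be scripted, in case it is useful. The assumptions here are mine: that unmapped files show up owned by nobody or nogroup (the usual fallback on Debian), that the share is mounted at /mnt/backup, and the helper name is_unmapped is made up. `nfsidmap -c` itself just clears the cached ID mappings so they get looked up again.

```shell
#!/bin/sh
# Hypothetical helper: succeed if an owner name looks like the fallback
# shown when an NFSv4 ID could not be mapped (assumption: "nobody" or
# "nogroup", as is typical on Debian).
is_unmapped() {
    case "$1" in
        nobody|nogroup) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage (run as root; /mnt/backup is an assumed mount point):
#   owner=$(stat -c %U /mnt/backup)
#   if is_unmapped "$owner"; then
#       nfsidmap -c          # clear the cached ID mappings
#   fi
```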

On Sat, Jan 7, 2017 at 2:40 PM Mini Trader <miniflowtra...@gmail.com> wrote:

> Hello all,
>
> I am having some issues with ID mapping.
>
> The behavior that I notice is as follows.
>
> 1. If I mount via /etc/fstab when automount is enabled, then I have no id
> mapping
> 2. If I mount after boot via mount -a (fstab contains noauto entry) my id
> mapping is correct
>
> Client is running:
>
> Linux backup 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u2 (2016-10-19)
> x86_64 GNU/Linux
>
> Any thoughts on what this might be?  Something with timing or order.
>
> Thanks.
>


NFS 4 id mapping does not work on auto mount

2017-01-07 Thread Mini Trader
Hello all,

I am having some issues with ID mapping.

The behavior that I notice is as follows.

1. If I mount via /etc/fstab when automount is enabled, then I have no id
mapping
2. If I mount after boot via mount -a (fstab contains noauto entry) my id
mapping is correct

Client is running:

Linux backup 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u2 (2016-10-19)
x86_64 GNU/Linux

Any thoughts on what this might be?  Something with timing or order.

Thanks.