Re: [Bug 204376] Cavium ThunderX system heavily loaded while at db> prompt

2015-11-08 Thread Bruce Evans

On Mon, 9 Nov 2015 bugzilla-nore...@freebsd.org wrote:


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204376

--- Comment #2 from Conrad E. Meyer  ---
If ARM is anything like amd64, it just spinwaits in IPI_STOP (waiting for the
CPU to be re-enabled).  On amd64 you could reduce it to 2 CPUs spinning pretty
easily (hlt any non-panic and non-BSP core -- they'll never be needed until
reboot).  But that still leaves 2 CPUs spinning.

The patch attempted to hlt all non-panic CPUs in IPI_STOP, but leave interrupts
enabled so they could be woken again.  This does Not Work Well in panic context
(I forget the details, but if you've panicked you really don't want normal
interrupt code running on the non-ddb CPU(s)).


Enabling normal interrupts breaks ddb context too.

ddb is already broken in restarting other CPUs when it single steps.
This usually enables interrupts on other CPUs (if not the current one),
so the state might be completely different after you step a single
instruction.  Just like it might be in normal operation for unlocked
states, but more so since in normal operation the single instruction
runs in a few cycles but for single stepping it takes thousands or
millions of cycles in real time (and the other CPUs run many of those
cycles in real time after they are restarted before they are stopped
again).  But it is inconvenient for the state that you are trying to
debug to change much.

Bruce
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"


Re: [Bug 204376] Cavium ThunderX system heavily loaded while at db> prompt

2015-11-08 Thread Bruce Evans

On Sun, 8 Nov 2015 comment-igno...@freebsd.org wrote:


--- Comment #1 from NGie Cooper  ---
It's not just arm64; amd64 does/did a horrible job at yielding when in the
debugger (part of the reason why we have a script which goes and suspends test
VMs at $work if/when they panic).


I thought it worked OK since it uses cpu_pause().  But Haswell runs
much hotter with cpu_pause() than with an empty loop.  monitor+mwait
works better.  For normal idling, here it runs about 10 degrees C
hotter than ACPI C2.  Loops in userland without cpu_pause() (foo:
jmp foo) run 25-30 degrees hotter.  Loops in userland with cpu_pause()
(foo: pause; jmp foo) run 25-35 degrees hotter.  ddb is about the same
as this.  But normal idling with cpu_pause() runs 40-45 degrees hotter.
So ddb saves power compared with misconfigured normal idling :-).
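
The two userland loops compared above can be approximated in C as
follows.  This is only a sketch, not the program used for the
measurements: spin_hint() and spin_wait() are hypothetical names,
spin_hint() is merely a stand-in for the kernel's cpu_pause(), and the
flag test and iteration cap are added so that the example terminates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for cpu_pause(): emits the architecture's
 * spin-wait hint where one is known, otherwise only a compiler barrier.
 * Uses GCC/Clang inline assembly syntax.
 */
static inline void
spin_hint(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__asm__ __volatile__("pause" ::: "memory");
#elif defined(__aarch64__)
	__asm__ __volatile__("yield" ::: "memory");
#else
	__asm__ __volatile__("" ::: "memory");
#endif
}

/*
 * Spin until *flag becomes nonzero or max_iters iterations have passed.
 * With use_hint == 0 this approximates the "foo: jmp foo" loop; with
 * use_hint != 0 it approximates the "foo: pause; jmp foo" loop.
 * Returns the number of iterations spent spinning.
 */
static uint64_t
spin_wait(volatile int *flag, uint64_t max_iters, int use_hint)
{
	uint64_t n = 0;

	assert(flag != NULL);
	while (*flag == 0 && n < max_iters) {
		if (use_hint)
			spin_hint();
		n++;
	}
	return (n);
}
```

Running both variants for a fixed wall-clock time while watching the
package temperature would be one way to repeat the comparison; the
numbers will of course depend on the CPU model.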


Conrad had a patch out for amd64 a few months ago which yielded in the debugger
a bit on amd64, but IIRC there were issues at the time. I'll let him comment on
it though.

It would be nice if dropping into the debugger didn't spin all the CPUs at
~100% though.


CPUs stopped by the debugger or anything else cannot do any normal
yielding or interrupt handling.  Perhaps they can halt with interrupts
disabled and depend on an NMI to wake them up.  NMIs are hard to program,
especially near kdb, but stopping CPUs already depends on NMIs to promptly
stop ones that are spinning with interrupts disabled.  When a CPU is
stopped, it must not handle any NMIs, but just let them kick it out of
a halt, and halt again unless the start/stop masks allow them to proceed.
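
The halt-and-recheck behaviour described above might look roughly like
this.  It is a userland simulation under stated assumptions, not kernel
code: park_cpu(), stopped_mask, started_mask, struct fake_nmi and
fake_halt() are all hypothetical names, and the "halt" is a callback
rather than a real hlt with interrupts disabled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical masks, loosely modelled on the start/stop masks above. */
static _Atomic uint64_t stopped_mask;	/* CPUs currently parked */
static _Atomic uint64_t started_mask;	/* CPUs that were told to resume */

/* Stand-in for "halt with interrupts disabled"; returns on each NMI. */
typedef void (*halt_fn)(void *arg);

/*
 * Park one CPU: halt, and after every wakeup halt again unless the
 * start mask allows this CPU to proceed.  The NMI itself is not handled
 * here; it only kicks the CPU out of the halt.
 * Returns how many times the CPU was woken before it could proceed.
 */
static unsigned
park_cpu(unsigned cpu, halt_fn halt_until_nmi, void *arg)
{
	uint64_t bit;
	unsigned wakeups = 0;

	assert(cpu < 64);
	bit = UINT64_C(1) << cpu;
	atomic_fetch_or(&stopped_mask, bit);
	while ((atomic_load(&started_mask) & bit) == 0) {
		halt_until_nmi(arg);
		wakeups++;
	}
	atomic_fetch_and(&started_mask, ~bit);
	atomic_fetch_and(&stopped_mask, ~bit);
	return (wakeups);
}

/* Simulated NMI source: releases the CPU on the Nth wakeup. */
struct fake_nmi {
	unsigned cpu;
	unsigned release_after;
	unsigned seen;
};

static void
fake_halt(void *arg)
{
	struct fake_nmi *f = arg;

	if (++f->seen >= f->release_after)
		atomic_fetch_or(&started_mask, UINT64_C(1) << f->cpu);
}
```

In the simulation the first few "NMIs" are spurious from the parked
CPU's point of view, so it simply halts again; only the wakeup that
finds its bit set in the start mask lets it leave the loop.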

I think all console drivers are too stupid to use cpu_pause() in
spinloops waiting for input, so sitting at the debugger prompt burns
at least 1 CPU, but since cpu_pause() apparently doesn't work very
well it might not make much difference.

cpu_pause() seemed to help when I used it to make the busy-waiting in
DEVICE_POLLING in idle less harmful.  Perhaps it works better in i/o
loops.  That might help the console drivers' i/o loops too.  But userland
tests show that doing a slow ISA i/o is cool enough by itself (perhaps
a little cooler than the dumb spinloop, and now adding the pause makes
little difference).  It seems reasonable to expect the CPU to mostly
shut down itself when waiting 5000+ cycles for slow i/o if neither
is emulated, and blame the emulation if it uses too many cycles or too
few cycles to emulate this.

The normal idle spinloop must be doing something stupid to run so much
hotter.  It calls sched_runnable() in a loop.  This obviously uses more
CPU resources than "foo: jmp foo", but in other tests I never saw much
difference caused by the instruction mix.  Apparently sched_runnable()
uses lots of CPU resources and pausing probably makes little difference.
Normal idle needs to wake up fast so it needs to check a lot, but even
the function call for this is wasteful.  Stopped CPUs don't need to
restart so fast (except for faster tracing in ddb -- it is already
about 1000 times too slow).
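
Since stopped CPUs need not restart quickly, one possible way to check
less often is to space the checks out with a growing gap.  The sketch
below is an illustration only and is not something proposed in this
thread: poll_with_backoff(), struct fake_resume and fake_check() are
hypothetical names, and the idle iterations are merely counted where a
real loop would pause or halt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical predicate type: returns nonzero once the CPU may resume. */
typedef int (*resume_check_fn)(void *arg);

/*
 * Poll check() with a gap that doubles after every failed check, capped
 * at max_gap idle iterations.  *spins counts idle iterations; the
 * return value counts how many times check() was called.
 */
static uint64_t
poll_with_backoff(resume_check_fn check, void *arg, uint64_t max_gap,
    uint64_t *spins)
{
	uint64_t checks = 0, gap = 1;

	assert(check != NULL && spins != NULL && max_gap >= 1);
	*spins = 0;
	for (;;) {
		checks++;
		if (check(arg))
			return (checks);
		for (uint64_t i = 0; i < gap; i++)
			(*spins)++;	/* a real loop would pause or halt here */
		gap = (gap > max_gap / 2) ? max_gap : gap * 2;
	}
}

/* Simulated condition: true once enough idle iterations have passed. */
struct fake_resume {
	const uint64_t *spins;
	uint64_t threshold;
};

static int
fake_check(void *arg)
{
	const struct fake_resume *f = arg;

	return (*f->spins >= f->threshold);
}
```

The trade-off is the usual one: fewer checks per unit of idle time, at
the cost of a restart latency of up to max_gap idle iterations.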

Bruce


[Bug 204376] Cavium ThunderX system heavily loaded while at db> prompt

2015-11-08 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204376

--- Comment #2 from Conrad E. Meyer  ---
If ARM is anything like amd64, it just spinwaits in IPI_STOP (waiting for the
CPU to be re-enabled).  On amd64 you could reduce it to 2 CPUs spinning pretty
easily (hlt any non-panic and non-BSP core -- they'll never be needed until
reboot).  But that still leaves 2 CPUs spinning.

The patch attempted to hlt all non-panic CPUs in IPI_STOP, but leave interrupts
enabled so they could be woken again.  This does Not Work Well in panic context
(I forget the details, but if you've panicked you really don't want normal
interrupt code running on the non-ddb CPU(s)).

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 204376] Cavium ThunderX system heavily loaded while at db> prompt

2015-11-08 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204376

NGie Cooper  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |c...@freebsd.org,
                   |                            |n...@freebsd.org

--- Comment #1 from NGie Cooper  ---
It's not just arm64; amd64 does/did a horrible job at yielding when in the
debugger (part of the reason why we have a script which goes and suspends test
VMs at $work if/when they panic).

Conrad had a patch out for amd64 a few months ago which yielded in the debugger
a bit on amd64, but IIRC there were issues at the time. I'll let him comment on
it though.

It would be nice if dropping into the debugger didn't spin all the CPUs at
~100% though.



[Bug 204378] xhci fails on Cavium ThunderX

2015-11-08 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204378

            Bug ID: 204378
           Summary: xhci fails on Cavium ThunderX
           Product: Base System
           Version: 11.0-CURRENT
          Hardware: arm64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs@FreeBSD.org
          Reporter: ema...@freebsd.org

ThunderX XHCI requires MSI

I tried booting with:

hw.usb.xhci.use_polling=0
hw.usb.xhci.debug=255

and I see:

xhci0:  mem 0x8680-0x8680001f,0x86800020-0x8680002f at device 16.0 on pci0
xhci_init: 
xhci_init: CAPLENGTH=0x20
xhci_init: RUNTIMEOFFSET=0x440
xhci_init: DOOROFFSET=0x480
xhci_init: xHCI version = 0x0100
xhci_init: HCS0 = 0x0220f665
xhci0: 64 bytes context size, 64-bit DMA
xhci_init: Max slots: 64
xhci_init: HCS2=0x0cf1
xhci_init: Max scratch: 1
xhci0: Could not allocate IRQ
xhci_halt_controller: 
device_attach: xhci0 attach returned 6
xhci0:  mem 0x8690-0x8690001f,0x86900020-0x8690002f at device 17.0 on pci0
xhci_init: 
xhci_init: CAPLENGTH=0x20
xhci_init: RUNTIMEOFFSET=0x440
xhci_init: DOOROFFSET=0x480
xhci_init: xHCI version = 0x0100
xhci_init: HCS0 = 0x0220f665
xhci0: 64 bytes context size, 64-bit DMA
xhci_init: Max slots: 64
xhci_init: HCS2=0x0cf1
xhci_init: Max scratch: 1
xhci0: Could not allocate IRQ
xhci_halt_controller: 
device_attach: xhci0 attach returned 6

I tried booting with xhci polling:

set hw.usb.xhci.use_polling=1
set hw.usb.xhci.debug=255

and I see:

xhci0:  mem 0x8680-0x8680001f,0x86800020-0x8680002f at device 16.0 on pci0
xhci_init: 
xhci_init: CAPLENGTH=0x20
xhci_init: RUNTIMEOFFSET=0x440
xhci_init: DOOROFFSET=0x480
xhci_init: xHCI version = 0x0100
xhci_init: HCS0 = 0x0220f665
xhci0: 64 bytes context size, 64-bit DMA
xhci_init: Max slots: 64
xhci_init: HCS2=0x0cf1
xhci_init: Max scratch: 1
xhci0: Could not allocate IRQ
xhci0: Interrupt polling at 1000Hz
xhci_interrupt: real interrupt (status=0x0001)
xhci_interrupt: host controller halted
xhci_halt_controller: 
xhci_start_controller: 
xhci_start_controller: CONFIG=0x -> 0x0040
xhci_start_controller: ERSTSZ=0x -> 0x0001
xhci_start_controller: ERDP(0)=0x07600080
xhci_start_controller: ERSTBA(0)=0x0760
xhci_start_controller: CRCR=0x07600d80
usbus0 on xhci0
xhci0: usbpf: Attached
random: harvesting attach, 8 bytes (4 bits) from usbus0
random: harvesting attach, 8 bytes (4 bits) from xhci0
xhci1:  mem 0x8690-0x8690001f,0x86900020-0x8690002f at device 17.0 on pci0
xhci_init: 
xhci_init: CAPLENGTH=0x20
xhci_init: RUNTIMEOFFSET=0x440
xhci_init: DOOROFFSET=0x480
xhci_init: xHCI version = 0x0100
xhci_init: HCS0 = 0x0220f665
xhci1: 64 bytes context size, 64-bit DMA
xhci_init: Max slots: 64
xhci_init: HCS2=0x0cf1
xhci_init: Max scratch: 1
xhci1: Could not allocate IRQ
xhci1: Interrupt polling at 1000Hz
xhci_interrupt: real interrupt (status=0x0001)
xhci_interrupt: host controller halted
xhci_halt_controller: 
xhci_start_controller: 
xhci_start_controller: CONFIG=0x -> 0x0040
xhci_start_controller: ERSTSZ=0x -> 0x0001
xhci_start_controller: ERDP(0)=0x066cd080
xhci_start_controller: ERSTBA(0)=0x066cd000
xhci_start_controller: CRCR=0x066cdd80
usbus1 on xhci1
xhci1: usbpf: Attached
random: harvesting attach, 8 bytes (4 bits) from usbus1
random: harvesting attach, 8 bytes (4 bits) from xhci1

and then every few seconds:

xhci_set_hw_power: 
xhci_set_hw_power: 
xhci_set_hw_power: 
xhci_set_hw_power: 
xhci_roothub_exec: type=0xa3 request=0x00 wLen=0x0004 wValue=0x wIndex=0x0001
xhci_roothub_exec: type=0xa3 request=0x00 wLen=0x0004 wValue=0x wIndex=0x0001
xhci_roothub_exec: UR_GET_STATUS i=1
xhci_roothub_exec: type=0xa3 request=0x00 wLen=0x0004 wValue=0x wIndex=0x0001
xhci_roothub_exec: port status=0x02a0
xhci_roothub_exec: UR_GET_STATUS i=1
xhci_roothub_exec: type=0xa3 request=0x00 wLen=0x0004 wValue=0x wIndex=0x0002
xhci_roothub_exec: port status=0x02a0
xhci_roothub_exec: UR_GET_STATUS i=2
xhci_roothub_exec: type=0xa3 request=0x00 wLen=0x0004 wValue=0x wIndex=0x0002
xhci_roothub_exec: port status=0x02a0
xhci_roothub_exec: UR_GET_STATUS i=2
xhci_roothub_exec: UR_GET_STATUS i=1
xhci_roothub_exec: port status=0x02a0
xhci_roothub_exec: port status=0x02a0
xhci_roothub_exec: type=0xa3 request=0x00 wLen=0x0004 wValue=0x wIndex=0x0001
xhci_roothub_exec: type=0xa3 request=0x00 wLen=0x0004 wValue=0x wIndex=0x0002
xhci_roothub_exec: UR_GET_STATUS i=1
xhci_roothub_exec: UR_GET_STATUS i=2
xhci_roothub_exec: port status=0x02a0
xhci_roothub_exec: port status=0x02a0
xhci_roothub_exec: type=0xa3 request=0x00 

[Bug 204376] Cavium ThunderX system heavily loaded while at db> prompt

2015-11-08 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204376

Ed Maste  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|System heavily loaded while |Cavium ThunderX system
                   |at db> prompt               |heavily loaded while at db>
                   |                            |prompt



[Bug 204376] System heavily loaded while at db> prompt

2015-11-08 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204376

Ed Maste  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Hardware|Any                         |arm64



[Bug 204376] System heavily loaded while at db> prompt

2015-11-08 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204376

            Bug ID: 204376
           Summary: System heavily loaded while at db> prompt
           Product: Base System
           Version: 11.0-CURRENT
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs@FreeBSD.org
          Reporter: ema...@freebsd.org

I don't have a way to directly quantify this, but have observed it as a
side-effect. I have a Cavium ThunderX 96-core system beside me in a hotel room.
While operating normally the system fans are reasonably quiet, but when the
system panics they start increasing in speed as the system heats up. After
a short while they are producing enough noise that I cannot leave the system
running, ending my debug session.



[Bug 204326] [patch] sys/dev/oce bugged promiscuous mode

2015-11-08 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204326

Mark Linimon  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|freebsd-bugs@FreeBSD.org    |freebsd-...@freebsd.org



[Bug 204358] zfs loader zfs_probe_args secsz is too small, causing memory corruption

2015-11-08 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204358

Mark Linimon  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|freebsd-bugs@FreeBSD.org    |freebsd...@freebsd.org
           Keywords|                            |patch



[Bug 204371] lib/libmd/sha1c.c - silence a compiler warning

2015-11-08 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204371

Mark Linimon  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |patch
