Re: Revision 309657 to stack_machdep.c renders unbootable system

2016-12-14 Thread Steven G. Kargl
On Wed, Dec 14, 2016 at 04:50:21PM -0800, Mark Johnston wrote:
> On Wed, Dec 14, 2016 at 03:48:04PM -0800, Steven G. Kargl wrote:
> > On Wed, Dec 14, 2016 at 02:10:48PM -0800, Mark Johnston wrote:
> > > On Wed, Dec 14, 2016 at 12:14:16PM -0800, Mark Johnston wrote:
> > > > On Wed, Dec 14, 2016 at 11:49:26AM -0800, Steven G. Kargl wrote:
> > > > > Well, after 3 days of bisection, I finally found the commit
> > > > > that renders my system unbootable.  The system does not panic.
> > > > > It simply gets stuck in some state.  Nonfunctional keyboard,
> > > > > so can't break into debugger.  No serial console available.
> > > > > The verbose dmesg.boot for a working kernel from revision
> > > > > 309656 is at
> > > > > 
> > > > > http://troutmask.apl.washington.edu/~kargl/freebsd/dmesg.309656.txt
> > > > > 
> > > > > The kernel config file is at
> > > > > 
> > > > > http://troutmask.apl.washington.edu/~kargl/freebsd/SPEW.txt
> > > > > 
> > > > > In looking at /usr/src/UPDATING, there is no warning that one
> > > > > can create a boat anchor by upgrading to 309657.  If compiling
> > > > > a kernel with 'options DDB' is no longer supported, this should
> > > > > be stated in UPDATING.  Or, UPDATING should state that 'options
> > > > > DDB' requires 'options STACK'.  Or, 'options DDB' should simply
> > > > > do the right thing and pull in whatever 'options STACK' does.
> > > > 
> > > > It is supported though - the point of that change was to fix a problem
> > > > that occurred when DDB is configured but STACK isn't. While testing I
> > > > tried every combination of the two options, and I just tried and
> > > > successfully booted a kernel with DDB and !STACK.
> > > > 
> > > > Does the kernel boot successfully if STACK is added to your
> > > > configuration?
> > > 
> > > I tried your config (plus virtio drivers) and was able to reproduce the
> > > hang in bhyve. Adding STACK "fixed" the hang, as did reverting part of
> > > my change to re-add dead code into the kernel. My VM was always hanging
> > > after printing
> > > 
> > > 000.50 [ 426] vtnet_netmap_attach   virtio attached txq=1, txd=1024 rxq=1, rxd=1024
> > > 
> > > Sure enough, removing "device netmap" from your config also fixes the
> > > hang. When the hang occurs, I can see with "bhyvectl --get-rip" that
> > > we're stuck in DELAY(), but I can't get a stack at that point. I think
> > > my change is an innocent bystander - it just happened to expose a latent
> > > issue elsewhere.
> > > 
> > > I don't have much more time to look at this right now, but I'll look
> > > into it more tonight.
> > 
> > Yes, adding STACK got me to a booting kernel.  I can't remember
> > why I added netmap to my config file.  Re-adding dead code seems to
> > point to some memory corruption issue or a rogue pointer. :(
> 
> It's not quite that bad, as it turns out. The key is that
> adding/removing the dead code changes the ordering of the items in the
> sysinit linker set. I discovered that if the ctl(4) module is
> initialized before the vtnet driver attaches, the hang occurs, and
> reverting my commit results in a sysinit order where vtnet comes
> _before_ ctl(4). So my change triggers the problem just because it
> happens to perturb something in the compile-time linker.

Thanks for the explanation.

> The issue actually seems to be in 4BSD, and more specifically in r308564
> and r308565. Switching to ULE or reverting either of those two commits
> fixes the hang.

Oh, this is bad.  The last time I checked (and it was
a while ago), ULE has/had some very bad performance issues
for numerical computations that use OpenMPI (or likely any
MPI implementation) if a node becomes oversubscribed.  4BSD
at least manages to recover.

Thanks for the pointer to r308564 and 65.  I'll take a look
later tonight as I've managed to break both firefox and chrome
during the upgrade.

-- 
Steve
http://troutmask.apl.washington.edu/~kargl/
https://www.youtube.com/watch?v=6hwgPfCcpyQ
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Revision 309657 to stack_machdep.c renders unbootable system

2016-12-14 Thread Steven G. Kargl
On Wed, Dec 14, 2016 at 02:10:48PM -0800, Mark Johnston wrote:
> On Wed, Dec 14, 2016 at 12:14:16PM -0800, Mark Johnston wrote:
> > On Wed, Dec 14, 2016 at 11:49:26AM -0800, Steven G. Kargl wrote:
> > > Well, after 3 days of bisection, I finally found the commit
> > > that renders my system unbootable.  The system does not panic.
> > > It simply gets stuck in some state.  Nonfunctional keyboard,
> > > so can't break into debugger.  No serial console available.
> > > The verbose dmesg.boot for a working kernel from revision
> > > 309656 is at
> > > 
> > > http://troutmask.apl.washington.edu/~kargl/freebsd/dmesg.309656.txt
> > > 
> > > The kernel config file is at
> > > 
> > > http://troutmask.apl.washington.edu/~kargl/freebsd/SPEW.txt
> > > 
> > > In looking at /usr/src/UPDATING, there is no warning that one
> > > can create a boat anchor by upgrading to 309657.  If compiling
> > > a kernel with 'options DDB' is no longer supported, this should
> > > be stated in UPDATING.  Or, UPDATING should state that 'options
> > > DDB' requires 'options STACK'.  Or, 'options DDB' should simply
> > > do the right thing and pull in whatever 'options STACK' does.
> > 
> > It is supported though - the point of that change was to fix a problem
> > that occurred when DDB is configured but STACK isn't. While testing I
> > tried every combination of the two options, and I just tried and
> > successfully booted a kernel with DDB and !STACK.
> > 
> > Does the kernel boot successfully if STACK is added to your
> > configuration?
> 
> I tried your config (plus virtio drivers) and was able to reproduce the
> hang in bhyve. Adding STACK "fixed" the hang, as did reverting part of
> my change to re-add dead code into the kernel. My VM was always hanging
> after printing
> 
> 000.50 [ 426] vtnet_netmap_attach   virtio attached txq=1, txd=1024 rxq=1, rxd=1024
> 
> Sure enough, removing "device netmap" from your config also fixes the
> hang. When the hang occurs, I can see with "bhyvectl --get-rip" that
> we're stuck in DELAY(), but I can't get a stack at that point. I think
> my change is an innocent bystander - it just happened to expose a latent
> issue elsewhere.
> 
> I don't have much more time to look at this right now, but I'll look
> into it more tonight.

Yes, adding STACK got me to a booting kernel.  I can't remember
why I added netmap to my config file.  Re-adding dead code seems to
point to some memory corruption issue or a rogue pointer. :(

BTW, I think it would be prudent to add something like 

  20161206:
 At revision 309657, 'options STACK' was introduced into
 sys/x86/x86/stack_machdep.c.  Old kernel configuration files
 that included 'options DDB' are now required to include also
 'options STACK'.

to UPDATING or some such wording.  I was jumping from circa Oct 10th world
to top of tree, and got caught by ~3000 commits.

Oh, and thanks for the work you've done on FreeBSD.

-- 
Steve
http://troutmask.apl.washington.edu/~kargl/
https://www.youtube.com/watch?v=6hwgPfCcpyQ


Revision 309657 to stack_machdep.c renders unbootable system

2016-12-14 Thread Steven G. Kargl
Well, after 3 days of bisection, I finally found the commit
that renders my system unbootable.  The system does not panic.
It simply gets stuck in some state.  Nonfunctional keyboard,
so can't break into debugger.  No serial console available.
The verbose dmesg.boot for a working kernel from revision
309656 is at

http://troutmask.apl.washington.edu/~kargl/freebsd/dmesg.309656.txt

The kernel config file is at

http://troutmask.apl.washington.edu/~kargl/freebsd/SPEW.txt

In looking at /usr/src/UPDATING, there is no warning that one
can create a boat anchor by upgrading to 309657.  If compiling
a kernel with 'options DDB' is no longer supported, this should
be stated in UPDATING.  Or, UPDATING should state that 'options
DDB' requires 'options STACK'.  Or, 'options DDB' should simply
do the right thing and pull in whatever 'options STACK' does.

-- 
Steve
http://troutmask.apl.washington.edu/~kargl/
https://www.youtube.com/watch?v=6hwgPfCcpyQ


Re: Revision 309657 to stack_machdep.c renders unbootable system

2016-12-14 Thread Mark Johnston
On Wed, Dec 14, 2016 at 03:48:04PM -0800, Steven G. Kargl wrote:
> On Wed, Dec 14, 2016 at 02:10:48PM -0800, Mark Johnston wrote:
> > On Wed, Dec 14, 2016 at 12:14:16PM -0800, Mark Johnston wrote:
> > > On Wed, Dec 14, 2016 at 11:49:26AM -0800, Steven G. Kargl wrote:
> > > > Well, after 3 days of bisection, I finally found the commit
> > > > that renders my system unbootable.  The system does not panic.
> > > > It simply gets stuck in some state.  Nonfunctional keyboard,
> > > > so can't break into debugger.  No serial console available.
> > > > The verbose dmesg.boot for a working kernel from revision
> > > > 309656 is at
> > > > 
> > > > http://troutmask.apl.washington.edu/~kargl/freebsd/dmesg.309656.txt
> > > > 
> > > > The kernel config file is at
> > > > 
> > > > http://troutmask.apl.washington.edu/~kargl/freebsd/SPEW.txt
> > > > 
> > > > In looking at /usr/src/UPDATING, there is no warning that one
> > > > can create a boat anchor by upgrading to 309657.  If compiling
> > > > a kernel with 'options DDB' is no longer supported, this should
> > > > be stated in UPDATING.  Or, UPDATING should state that 'options
> > > > DDB' requires 'options STACK'.  Or, 'options DDB' should simply
> > > > do the right thing and pull in whatever 'options STACK' does.
> > > 
> > > It is supported though - the point of that change was to fix a problem
> > > that occurred when DDB is configured but STACK isn't. While testing I
> > > tried every combination of the two options, and I just tried and
> > > successfully booted a kernel with DDB and !STACK.
> > > 
> > > Does the kernel boot successfully if STACK is added to your
> > > configuration?
> > 
> > I tried your config (plus virtio drivers) and was able to reproduce the
> > hang in bhyve. Adding STACK "fixed" the hang, as did reverting part of
> > my change to re-add dead code into the kernel. My VM was always hanging
> > after printing
> > 
> > 000.50 [ 426] vtnet_netmap_attach   virtio attached txq=1, txd=1024 rxq=1, rxd=1024
> > 
> > Sure enough, removing "device netmap" from your config also fixes the
> > hang. When the hang occurs, I can see with "bhyvectl --get-rip" that
> > we're stuck in DELAY(), but I can't get a stack at that point. I think
> > my change is an innocent bystander - it just happened to expose a latent
> > issue elsewhere.
> > 
> > I don't have much more time to look at this right now, but I'll look
> > into it more tonight.
> 
> Yes, adding STACK got me to a booting kernel.  I can't remember
> why I added netmap to my config file.  Re-adding dead code seems to
> point to some memory corruption issue or a rogue pointer. :(

It's not quite that bad, as it turns out. The key is that
adding/removing the dead code changes the ordering of the items in the
sysinit linker set. I discovered that if the ctl(4) module is
initialized before the vtnet driver attaches, the hang occurs, and
reverting my commit results in a sysinit order where vtnet comes
_before_ ctl(4). So my change triggers the problem just because it
happens to perturb something in the compile-time linker.
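
The ordering effect described above can be illustrated with a small user-space toy model. This is only a sketch under stated assumptions: `struct init_entry`, `sort_entries()` and `position_of()` are hypothetical names invented for illustration, not the kernel's sysinit structures or code. The model sorts entries by (subsystem, order) with a stable sort, so entries with equal keys keep whatever order the linker happened to emit them in.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an init entry; not the kernel's struct sysinit. */
struct init_entry {
    const char *name;
    unsigned subsystem;   /* coarse boot phase */
    unsigned order;       /* order within the phase */
};

/* Stable insertion sort by (subsystem, order); ties keep their input order. */
static void sort_entries(struct init_entry *v, size_t n)
{
    assert(v != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        struct init_entry key = v[i];
        size_t j = i;
        while (j > 0 && (v[j - 1].subsystem > key.subsystem ||
            (v[j - 1].subsystem == key.subsystem &&
            v[j - 1].order > key.order))) {
            v[j] = v[j - 1];
            j--;
        }
        v[j] = key;
    }
}

/* Returns the index of the entry called name, or n if it is absent. */
static size_t position_of(const struct init_entry *v, size_t n,
    const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(v[i].name, name) == 0)
            return i;
    return n;
}
```

In this model, two entries named "ctl" and "vtnet" with identical keys run in the order they appear in the input array, so swapping them in the input (as a change in link order would) swaps which one initializes first, even though neither entry's declared priority changed.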

> 
> BTW, I think it would be prudent to add something like 
> 
>   20161206:
>  At revision 309657, 'options STACK' was introduced into
>  sys/x86/x86/stack_machdep.c.  Old kernel configuration files
>  that included 'options DDB' are now required to include also
>  'options STACK'.
> 
> to UPDATING or some such wording.  I was jumping from circa Oct 10th world
> to top of tree, and got caught by ~3000 commits.

The issue actually seems to be in 4BSD, and more specifically in r308564
and r308565. Switching to ULE or reverting either of those two commits
fixes the hang. Here's what happens:

1. ctl_init() runs and creates ctl_thresh_thread. This thread's main
   loop calls pause(9) when it has no work to do. During boot, pause(9)
   just calls DELAY() and does not yield the CPU.
2. thread0 attaches the vtnet driver. As part of this, it creates and
   starts some high-priority taskqueue threads in
   _taskqueue_thread_start(). They're added to the scheduler with:

   thread_lock();
   sched_prio(...);
   sched_add(...);
   thread_unlock();

   4BSD's sched_add() will call maybe_preempt() in this case, which as
   of r308564 will unconditionally set td_owepreempt in the current
   thread.
3. thread_unlock() will release the critical section held by the current
   thread and because td_owepreempt is set, we'll yield the CPU. The
   taskqueue threads have nothing to do, but ctl_thresh_thread runs
   and ends up busy-waiting in pause() forever.

r308565 removes a check in maybe_preempt() that would have stopped
td_owepreempt from being set. Before r308564, maybe_preempt() would have
switched directly to the new thread and apparently always switched back
immediately.

I'm not sure what the correct fix is - jhb might have an idea. I wonder
if pause() should try to yield periodically when called during boot.

Re: Enabling NUMA in BIOS stop booting FreeBSD

2016-12-14 Thread A. Wilcox
On 14/12/16 13:48, Slawa Olhovchenkov wrote:
> On Wed, Dec 14, 2016 at 09:43:24PM +0200, Konstantin Belousov wrote:
> 
>> On Wed, Dec 14, 2016 at 10:29:43PM +0300, Slawa Olhovchenkov wrote:
>>> On Wed, Dec 14, 2016 at 09:03:49PM +0200, Konstantin Belousov wrote:
>>>
>>>> On Wed, Dec 14, 2016 at 06:26:27PM +0300, Slawa Olhovchenkov wrote:
>>>>> For test hardware setup (NUMA+interleave), what ISO I can try to boot?
>>>> Didn't you already tried ?
>>>
>>> Different from FreeBSD.
>> Can you reformulate the statement ?
>> Did you booted some other (non-FreeBSD) OS and it hung with that options
>> as well ?
> 
> No, I don't try now, can you advice some OS for test?

Ugh, Supermicro is a big pain.

Try CentOS, also try Debian.  Just to see.  Maybe you get lucky, and one
of them hangs too... Debian usually runs older kernels, so more likely
to not have a workaround.

Best solution: new mainboard vendor, until Supermicro works out their
dumb firmware and makes it less dumb. :(

--arw


-- 
A. Wilcox (awilfox)
Open-source programmer (C, C++, Python)
https://code.foxkit.us/u/awilfox/





Re: Revision 309657 to stack_machdep.c renders unbootable system

2016-12-14 Thread Mark Johnston
On Wed, Dec 14, 2016 at 12:14:16PM -0800, Mark Johnston wrote:
> On Wed, Dec 14, 2016 at 11:49:26AM -0800, Steven G. Kargl wrote:
> > Well, after 3 days of bisection, I finally found the commit
> > that renders my system unbootable.  The system does not panic.
> > It simply gets stuck in some state.  Nonfunctional keyboard,
> > so can't break into debugger.  No serial console available.
> > The verbose dmesg.boot for a working kernel from revision
> > 309656 is at
> > 
> > http://troutmask.apl.washington.edu/~kargl/freebsd/dmesg.309656.txt
> > 
> > The kernel config file is at
> > 
> > http://troutmask.apl.washington.edu/~kargl/freebsd/SPEW.txt
> > 
> > In looking at /usr/src/UPDATING, there is no warning that one
> > can create a boat anchor by upgrading to 309657.  If compiling
> > a kernel with 'options DDB' is no longer supported, this should
> > be stated in UPDATING.  Or, UPDATING should state that 'options
> > DDB' requires 'options STACK'.  Or, 'options DDB' should simply
> > do the right thing and pull in whatever 'options STACK' does.
> 
> It is supported though - the point of that change was to fix a problem
> that occurred when DDB is configured but STACK isn't. While testing I
> tried every combination of the two options, and I just tried and
> successfully booted a kernel with DDB and !STACK.
> 
> Does the kernel boot successfully if STACK is added to your
> configuration?

I tried your config (plus virtio drivers) and was able to reproduce the
hang in bhyve. Adding STACK "fixed" the hang, as did reverting part of
my change to re-add dead code into the kernel. My VM was always hanging
after printing

000.50 [ 426] vtnet_netmap_attach   virtio attached txq=1, txd=1024 rxq=1, rxd=1024

Sure enough, removing "device netmap" from your config also fixes the
hang. When the hang occurs, I can see with "bhyvectl --get-rip" that
we're stuck in DELAY(), but I can't get a stack at that point. I think
my change is an innocent bystander - it just happened to expose a latent
issue elsewhere.

I don't have much more time to look at this right now, but I'll look
into it more tonight.


Re: Revision 309657 to stack_machdep.c renders unbootable system

2016-12-14 Thread Mark Johnston
On Wed, Dec 14, 2016 at 11:49:26AM -0800, Steven G. Kargl wrote:
> Well, after 3 days of bisection, I finally found the commit
> that renders my system unbootable.  The system does not panic.
> It simply gets stuck in some state.  Nonfunctional keyboard,
> so can't break into debugger.  No serial console available.
> The verbose dmesg.boot for a working kernel from revision
> 309656 is at
> 
> http://troutmask.apl.washington.edu/~kargl/freebsd/dmesg.309656.txt
> 
> The kernel config file is at
> 
> http://troutmask.apl.washington.edu/~kargl/freebsd/SPEW.txt
> 
> In looking at /usr/src/UPDATING, there is no warning that one
> can create a boat anchor by upgrading to 309657.  If compiling
> a kernel with 'options DDB' is no longer supported, this should
> be stated in UPDATING.  Or, UPDATING should state that 'options
> DDB' requires 'options STACK'.  Or, 'options DDB' should simply
> do the right thing and pull in whatever 'options STACK' does.

It is supported though - the point of that change was to fix a problem
that occurred when DDB is configured but STACK isn't. While testing I
tried every combination of the two options, and I just tried and
successfully booted a kernel with DDB and !STACK.

Does the kernel boot successfully if STACK is added to your
configuration?


Re: Enabling NUMA in BIOS stop booting FreeBSD

2016-12-14 Thread Slawa Olhovchenkov
On Wed, Dec 14, 2016 at 09:43:24PM +0200, Konstantin Belousov wrote:

> On Wed, Dec 14, 2016 at 10:29:43PM +0300, Slawa Olhovchenkov wrote:
> > On Wed, Dec 14, 2016 at 09:03:49PM +0200, Konstantin Belousov wrote:
> > 
> > > On Wed, Dec 14, 2016 at 06:26:27PM +0300, Slawa Olhovchenkov wrote:
> > > > For test hardware setup (NUMA+interleave), what ISO I can try to boot?
> > > Didn't you already tried ?
> > 
> > Different from FreeBSD.
> Can you reformulate the statement ?
> Did you booted some other (non-FreeBSD) OS and it hung with that options
> as well ?

No, I haven't tried yet. Can you advise some OS to test with?

> > For sure about firmware problem and complains to Supermicro.
> > I think FreeBSD problem don't be accepted by Supermicro.
> You never know.


Re: Enabling NUMA in BIOS stop booting FreeBSD

2016-12-14 Thread Konstantin Belousov
On Wed, Dec 14, 2016 at 10:29:43PM +0300, Slawa Olhovchenkov wrote:
> On Wed, Dec 14, 2016 at 09:03:49PM +0200, Konstantin Belousov wrote:
> 
> > On Wed, Dec 14, 2016 at 06:26:27PM +0300, Slawa Olhovchenkov wrote:
> > > For test hardware setup (NUMA+interleave), what ISO I can try to boot?
> > Didn't you already tried ?
> 
> Different from FreeBSD.
Can you reformulate the statement?
Did you boot some other (non-FreeBSD) OS, and did it hang with those
options as well?

> For sure about firmware problem and complains to Supermicro.
> I think FreeBSD problem don't be accepted by Supermicro.
You never know.


Re: Enabling NUMA in BIOS stop booting FreeBSD

2016-12-14 Thread Slawa Olhovchenkov
On Wed, Dec 14, 2016 at 09:03:49PM +0200, Konstantin Belousov wrote:

> On Wed, Dec 14, 2016 at 06:26:27PM +0300, Slawa Olhovchenkov wrote:
> > On Wed, Dec 14, 2016 at 03:13:36PM +0300, Slawa Olhovchenkov wrote:
> > 
> > > On Wed, Dec 14, 2016 at 01:39:27PM +0200, Konstantin Belousov wrote:
> > > 
> > > > In other words, it is almost certainly a hang and not a fault causing
> > > > the hang. This means that the machine is not compliant with the IA32
> > > > architecture; in particular, the region reported as normal memory by
> > > > the E820 BIOS service does not behave as normal memory.
> > > > 
> > > > Since the memory map is the same regardless of the option setting, and
> > > > the bootstrap page table depends only on the memory map, we use the
> > > > same page table when hanging and when operating correctly. We do not
> > > > fault or hang when the option is turned off, which, together with the
> > > > improved early fault handling in the patch, makes it almost certain
> > > > that the problem is in the hardware configuration and not in our
> > > > early setup.
> > > > 
> > > > Of course, the most puzzling part is that the memory test makes the
> > > > hang go away, while repeating the memory test operation only on the
> > > > msgbuf region does not. msgbuf is special in that it is located at
> > > > TOHM (top of high memory). It spans 128KB from below it to the last
> > > > byte of the last physical segment.
> > > > 
> > > > The only ideas I have right now are that there is either a bug in the
> > > > Caching Agent/Home Agent/IMC configuration in the BIOS, in which case
> > > > there is nothing the OS can do to mitigate it, or that the memory
> > > > map reported by CSM is wrong (you said that you use legacy boot,
> > > > right?).  This would not be too surprising, because the non-EFI boot
> > > > code path definitely gets less and less testing.
> > > > 
> > > > For the latter case (potential bug in CSM), could you switch to EFI
> > > > boot mode and see whether the issue magically heals itself?  You could
> > > > boot from a USB stick in EFI mode without reinstalling for the test.
> > > 
> > > I can't boot from a USB stick -- this is a remote DC and IPMI allows
> > > only CDROM emulation.
> > > 
> > > OK, I booted the 12.0 snapshot ISO in UEFI mode.
> > > It boots OK.
> > 
> > Sorry. I was overloaded with work and tested the wrong combination
> > (NUMA=ON, interleave=OFF).
> > 
> > The snapshot ISO doesn't boot with NUMA=ON interleave=ON.
> Ok.
> 
> > 
> > For test hardware setup (NUMA+interleave), what ISO I can try to boot?
> Didn't you already tried ?

I mean an OS other than FreeBSD, to be sure it is a firmware problem
before complaining to Supermicro.
I think a FreeBSD-only problem won't be accepted by Supermicro.


Re: Enabling NUMA in BIOS stop booting FreeBSD

2016-12-14 Thread Konstantin Belousov
On Wed, Dec 14, 2016 at 06:26:27PM +0300, Slawa Olhovchenkov wrote:
> On Wed, Dec 14, 2016 at 03:13:36PM +0300, Slawa Olhovchenkov wrote:
> 
> > On Wed, Dec 14, 2016 at 01:39:27PM +0200, Konstantin Belousov wrote:
> > 
> > > In other words, it is almost certainly a hang and not a fault causing
> > > the hang. This means that the machine is not compliant with the IA32
> > > architecture; in particular, the region reported as normal memory by
> > > the E820 BIOS service does not behave as normal memory.
> > > 
> > > Since the memory map is the same regardless of the option setting, and
> > > the bootstrap page table depends only on the memory map, we use the same
> > > page table when hanging and when operating correctly. We do not fault or
> > > hang when the option is turned off, which, together with the improved
> > > early fault handling in the patch, makes it almost certain that the
> > > problem is in the hardware configuration and not in our early setup.
> > > 
> > > Of course, the most puzzling part is that the memory test makes the hang
> > > go away, while repeating the memory test operation only on the msgbuf
> > > region does not. msgbuf is special in that it is located at TOHM (top of
> > > high memory). It spans 128KB from below it to the last byte of the last
> > > physical segment.
> > > 
> > > The only ideas I have right now are that there is either a bug in the
> > > Caching Agent/Home Agent/IMC configuration in the BIOS, in which case
> > > there is nothing the OS can do to mitigate it, or that the memory
> > > map reported by CSM is wrong (you said that you use legacy boot, right?).
> > > This would not be too surprising, because the non-EFI boot code path
> > > definitely gets less and less testing.
> > > 
> > > For the latter case (potential bug in CSM), could you switch to EFI boot
> > > mode and see whether the issue magically heals itself?  You could boot
> > > from a USB stick in EFI mode without reinstalling for the test.
> > 
> > I can't boot from a USB stick -- this is a remote DC and IPMI allows
> > only CDROM emulation.
> > 
> > OK, I booted the 12.0 snapshot ISO in UEFI mode.
> > It boots OK.
> 
> Sorry. I was overloaded with work and tested the wrong combination
> (NUMA=ON, interleave=OFF).
> 
> The snapshot ISO doesn't boot with NUMA=ON interleave=ON.
Ok.

> 
> For test hardware setup (NUMA+interleave), what ISO I can try to boot?
Didn't you already try?

> 
> PS: memmaps:
> 
> NUMA=ON interleave=OFF
> OK memmap
>Type Physical  Virtual   #Pages Attr
>BootServicesCode   0008 UC WC WT WB
>  ConventionalMemory 8000  0027 UC WC WT WB
>BootServicesData 0002f000  0011 UC WC WT WB
>BootServicesCode 0004  0060 UC WC WT WB
>  ConventionalMemory 0010  000660a3 UC WC WT WB
>BootServicesData 661a3000  0080 UC WC WT WB
>  ConventionalMemory 66223000  76b8 UC WC WT WB
>  LoaderData 6d8db000  8000 UC WC WT WB
>  LoaderCode 758db000  0070 UC WC WT WB
>BootServicesData 7594b000  3220 UC WC WT WB
>  ConventionalMemory 78b6b000  028e UC WC WT WB
>BootServicesCode 78df9000  0372 UC WC WT WB
>Reserved 7916b000  0817 UC WC WT WB
>  ConventionalMemory 79982000  011f UC WC WT WB
>   ACPIMemoryNVS 79aa1000  0509 UC WC WT WB
> RuntimeServicesData 79faa000  1dbd UC WC WT WB
> RuntimeServicesCode 7bd67000  0061 UC WC WT WB
>BootServicesData 7bdc8000  0001 UC WC WT WB
> RuntimeServicesData 7bdc9000  0086 UC WC WT WB
>BootServicesData 7be4f000  01b1 UC WC WT WB
>  ConventionalMemory 0001  01f8 UC WC WT WB
>Reserved 7c00  4000
>  MemoryMappedIO 8000  0001 UC
>  MemoryMappedIO fed1c000  0029 UC
>  MemoryMappedIO ff00  1000 UC
> 
> NUMA=ON interleave=ON
>Type Physical  Virtual   #Pages Attr
>BootServicesCode   0008 UC WC WT WB
>  ConventionalMemory 8000  0027 UC WC WT WB
>BootServicesData 0002f000  0011 UC WC WT WB
>BootServicesCode 0004  0060 UC WC WT WB
>  ConventionalMemory 0010  000660a3 UC WC WT WB
>BootServicesData 661a3000  0080 UC WC WT WB
>  ConventionalMemory 66223000  76b8 UC WC WT WB
>  LoaderData 6d8db000  8000 UC WC WT WB
>  LoaderCode 758db000 

Re: [RFC/RFT] projects/ipsec

2016-12-14 Thread Eugene M. Zheganin

Hi,

On 11.12.2016 4:07, Andrey V. Elsukov wrote:

> Hi All,
>
> I am pleased to announce that projects/ipsec, which I started several
> months ago, is ready for testing and review.
> The main goals were:
>    * rework locking to make the IPsec code more friendly to concurrent
>      processing;
>    * make lookup in SADB/SPDB faster;
>    * revise the PFKEY implementation, remove stale code, make it closer
>      to the RFC;
>    * implement IPsec VTI (virtual tunneling interface);
>    * make the IPsec code loadable as a kernel module.
>
> Currently all except the last one are mostly done. So I decided to ask for
> help testing what is already done, while I work on the last task.

Well, at last FreeBSD has got one of the most anticipated features in its
ipsec stack. When I wrote the message to the freebsd-net ML in the
middle of 2012
(https://lists.freebsd.org/pipermail/freebsd-net/2012-June/032556.html)
I had very little hope that someone would actually implement this, and
now I'm very grateful that Andrey found the time to do it (and I'm
really sorry for being such a pain in the ass; I say so because I
was bothering Andrey all this time on IRC). This definitely isn't a
feature that every FreeBSD enthusiast will use and, sadly, not even a
feature that every network engineer who uses ipsec in their everyday
work will configure (many people still use the obsolete legacy
interfaceless ipsec approach, not to mention weird hybrid software
routers like openvpn), but it's definitely a feature that will be
appreciated by every skilled L3 VPN engineer who uses FreeBSD in
their operating stack. I've run some tests in my production network and I
should say that even in its initial release state if_ipsec is fully
operational with a Juniper st tunnel on the other side, so I'm already
running one FreeBSD <--> Juniper tunnel at my work:


# ifconfig ipsec0
ipsec0: flags=8051 metric 0 mtu 1400
	tunnel inet 128.127.144.19 --> 128.127.146.1
	inet 172.16.3.104 --> 172.16.3.105 netmask 0x
	inet6 fe80::204:23ff:fec7:194d%ipsec0 prefixlen 64 scopeid 0x9
	nd6 options=21
	reqid: 16385
	groups: ipsec

racoon.conf:
path pre_shared_key "/usr/local/etc/racoon/psk.txt";

padding {
    maximum_length 20;  # maximum padding length.
    randomize off;      # enable randomize length.
    strict_check off;   # enable strict check.
    exclusive_tail off; # extract last one octet.
}

listen {
    isakmp 128.127.144.19 [500];
    strict_address;     # requires that all addresses must be bound.
}

timer {
    counter 5;
    interval 20 sec;
    persend 1;

    phase1 30 sec;
    phase2 15 sec;
}

#
# SPb, Test
#

remote 128.127.146.1 {
    exchange_mode main;
    lifetime time 1 hour;
    my_identifier address 128.127.144.19;
    peers_identifier address 128.127.146.1;
    passive off;
    proposal_check obey;
    dpd_delay 20;
    proposal {
        encryption_algorithm des;
        hash_algorithm md5;
        authentication_method pre_shared_key;
        dh_group modp768;
    }
}

#
# SPb, Test
#

sainfo address 0.0.0.0/0 [500] any address 0.0.0.0/0 [500] any {
    pfs_group modp768;
    lifetime time 60 min;
    encryption_algorithm des;
    authentication_algorithm non_auth;
    compression_algorithm deflate;
}

Juniper side:

> show configuration interfaces st0.147
description "Perm, FreeBSD Test Server";
family inet {
    mtu 1455;
    address 172.16.3.105/32 {
        destination 172.16.3.104;
    }
}

> show configuration security ike policy kosm65
proposals norma-ike;
pre-shared-key ascii-text "$9$-SV4ZUDkqPQUjBIclLXgoJUqf9CuESeAp-w2gGUjHqfQn"; ## SECRET-DATA

> show configuration security ike gateway kosm65-freebsd-test
ike-policy perm-freebsd-test;
address 128.127.144.19;
local-identity inet 128.127.146.1;
remote-identity inet 128.127.144.19;
external-interface reth1.2;

> show configuration security ipsec vpn kosm65-freebsd-test
bind-interface st0.147;
ike {
    gateway kosm65-freebsd-test;
    ipsec-policy norma-policy;
}

> show configuration security ipsec policy norma-policy
perfect-forward-secrecy {
    keys group1;
}
proposals norma-ipsec;

> show configuration security ipsec proposal norma-ipsec
protocol esp;
encryption-algorithm des-cbc;
lifetime-seconds 600;

> show configuration security ike proposal norma-ike
authentication-method pre-shared-keys;
dh-group group1;
authentication-algorithm md5;
encryption-algorithm des-cbc;

In its initial state if_ipsec allows only one set of encryption 
parameters (because only one anonymous sainfo is possible), so for now 
it cannot create multiple tunnels to VPN hubs that use 
different ciphers and/or transform sets. As far as I understand, this 
is subject to change, and Andrey is already working on support for this 
feature in the ipsec-tools IKE daemon. Even in this state the feature 
is already useful, and I'm excited to see it committed to HEAD and then 
MFC'd 

FreeBSD_HEAD_i386 - Build #4384 - Fixed

2016-12-14 Thread jenkins-admin
FreeBSD_HEAD_i386 - Build #4384 - Fixed:

Build information: https://jenkins.FreeBSD.org/job/FreeBSD_HEAD_i386/4384/
Full change log: https://jenkins.FreeBSD.org/job/FreeBSD_HEAD_i386/4384/changes
Full build log: https://jenkins.FreeBSD.org/job/FreeBSD_HEAD_i386/4384/console

Change summaries:

310061 by manu:
Add new compatible string "allwinner,sun7i-a20-mmc".

New upstream DTS is using this now for A20 SoC.

MFC after:  3 days

310058 by hselasky:
Fix initialisation of mlx4_pci_table's .driver_data fields.

MFC after:  1 week
Differential Revision:  https://reviews.freebsd.org/D8791
Sponsored by:   Mellanox Technologies
Submitted by:   Dexuan Cui 

310057 by ed:
Revert accidental change made in r310056.

Because I had to cherry-pick some of my changes in r310051, I
accidentally made a typo when manually applying the rest in r310056.

___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Enabling NUMA in BIOS stop booting FreeBSD

2016-12-14 Thread Slawa Olhovchenkov
On Wed, Dec 14, 2016 at 03:13:36PM +0300, Slawa Olhovchenkov wrote:

> On Wed, Dec 14, 2016 at 01:39:27PM +0200, Konstantin Belousov wrote:
> 
> > In other words, it is almost certainly a hang and not a fault causing a
> > hang. This means that the machine is not compliant with the IA32
> > architecture; in particular, the region reported as normal memory by
> > the E820 BIOS service does not behave as normal memory.
> > 
> > Since the memory map is the same regardless of the option setting, and
> > the bootstrap page table depends only on the memory map, we use the same
> > page table when hanging and when operating correctly. We do not fault or
> > hang when the option is turned off, which, together with the improved
> > early fault handling in the patch, makes it almost certain that the
> > problem is in the hardware configuration and not in our early setup.
> > 
> > Of course, the most puzzling part is that the memory test makes the hang
> > go away, while repeating the memory test operation only on the msgbuf
> > region does not. msgbuf is special in that it is located at TOHM (top of
> > high memory). It spans 128KB from below it to the last byte of the last
> > physical segment.
> > 
> > The only ideas I have right now are that either there is a bug in the
> > Caching Agent/Home Agent/IMC configuration in the BIOS, in which case
> > there is nothing the OS can do to mitigate it, or the memory map
> > reported by the CSM is wrong (you said that you use legacy boot, right?).
> > The latter would not be too surprising, because the non-EFI boot code
> > path definitely gets less and less testing.
> > 
> > For the latter case (a potential bug in the CSM), could you switch to EFI
> > boot mode and see whether the issue magically heals itself?  You could
> > boot from a USB stick in EFI mode without reinstalling, as a test.
> 
> I can't boot from a USB stick: this is a remote DC and IPMI allows only
> CDROM emulation.
> 
> OK, I booted the 12.0 snapshot ISO in UEFI mode.
> It boots OK.

Sorry, I was overloaded with work and tested the wrong combination
(NUMA=ON, interleave=OFF).

The snapshot ISO doesn't boot with NUMA=ON interleave=ON.

To test this hardware setup (NUMA+interleave), which ISO can I try to boot?

PS: memmaps:

NUMA=ON interleave=OFF
OK memmap
   Type Physical  Virtual   #Pages Attr
   BootServicesCode   0008 UC WC WT WB
 ConventionalMemory 8000  0027 UC WC WT WB
   BootServicesData 0002f000  0011 UC WC WT WB
   BootServicesCode 0004  0060 UC WC WT WB
 ConventionalMemory 0010  000660a3 UC WC WT WB
   BootServicesData 661a3000  0080 UC WC WT WB
 ConventionalMemory 66223000  76b8 UC WC WT WB
 LoaderData 6d8db000  8000 UC WC WT WB
 LoaderCode 758db000  0070 UC WC WT WB
   BootServicesData 7594b000  3220 UC WC WT WB
 ConventionalMemory 78b6b000  028e UC WC WT WB
   BootServicesCode 78df9000  0372 UC WC WT WB
   Reserved 7916b000  0817 UC WC WT WB
 ConventionalMemory 79982000  011f UC WC WT WB
  ACPIMemoryNVS 79aa1000  0509 UC WC WT WB
RuntimeServicesData 79faa000  1dbd UC WC WT WB
RuntimeServicesCode 7bd67000  0061 UC WC WT WB
   BootServicesData 7bdc8000  0001 UC WC WT WB
RuntimeServicesData 7bdc9000  0086 UC WC WT WB
   BootServicesData 7be4f000  01b1 UC WC WT WB
 ConventionalMemory 0001  01f8 UC WC WT WB
   Reserved 7c00  4000
 MemoryMappedIO 8000  0001 UC
 MemoryMappedIO fed1c000  0029 UC
 MemoryMappedIO ff00  1000 UC

NUMA=ON interleave=ON
   Type Physical  Virtual   #Pages Attr
   BootServicesCode   0008 UC WC WT WB
 ConventionalMemory 8000  0027 UC WC WT WB
   BootServicesData 0002f000  0011 UC WC WT WB
   BootServicesCode 0004  0060 UC WC WT WB
 ConventionalMemory 0010  000660a3 UC WC WT WB
   BootServicesData 661a3000  0080 UC WC WT WB
 ConventionalMemory 66223000  76b8 UC WC WT WB
 LoaderData 6d8db000  8000 UC WC WT WB
 LoaderCode 758db000  0070 UC WC WT WB
   BootServicesData 7594b000  3220 UC WC WT WB
 ConventionalMemory 78b6b000  028e UC WC WT WB
   BootServicesCode 78df9000  0372 UC WC WT WB
   Reserved 7916b000 

Re: Enabling NUMA in BIOS stop booting FreeBSD

2016-12-14 Thread Slawa Olhovchenkov
On Wed, Dec 14, 2016 at 04:40:33PM +0200, Konstantin Belousov wrote:

> On Wed, Dec 14, 2016 at 03:13:36PM +0300, Slawa Olhovchenkov wrote:
> > On Wed, Dec 14, 2016 at 01:39:27PM +0200, Konstantin Belousov wrote:
> > 
> > > In other words, it is almost certainly a hang and not a fault causing a
> > > hang. This means that the machine is not compliant with the IA32
> > > architecture; in particular, the region reported as normal memory by
> > > the E820 BIOS service does not behave as normal memory.
> > > 
> > > Since the memory map is the same regardless of the option setting, and
> > > the bootstrap page table depends only on the memory map, we use the same
> > > page table when hanging and when operating correctly. We do not fault or
> > > hang when the option is turned off, which, together with the improved
> > > early fault handling in the patch, makes it almost certain that the
> > > problem is in the hardware configuration and not in our early setup.
> > > 
> > > Of course, the most puzzling part is that the memory test makes the hang
> > > go away, while repeating the memory test operation only on the msgbuf
> > > region does not. msgbuf is special in that it is located at TOHM (top of
> > > high memory). It spans 128KB from below it to the last byte of the last
> > > physical segment.
> > > 
> > > The only ideas I have right now are that either there is a bug in the
> > > Caching Agent/Home Agent/IMC configuration in the BIOS, in which case
> > > there is nothing the OS can do to mitigate it, or the memory map
> > > reported by the CSM is wrong (you said that you use legacy boot, right?).
> > > The latter would not be too surprising, because the non-EFI boot code
> > > path definitely gets less and less testing.
> > > 
> > > For the latter case (a potential bug in the CSM), could you switch to EFI
> > > boot mode and see whether the issue magically heals itself?  You could
> > > boot from a USB stick in EFI mode without reinstalling, as a test.
> > 
> > I can't boot from a USB stick: this is a remote DC and IPMI allows only
> > CDROM emulation.
> > 
> > OK, I booted the 12.0 snapshot ISO in UEFI mode.
> > It boots OK.
> > 
> > Can I convert the installed OS to UEFI mode?
> I am not sure what you are asking there.  Are you asking whether I need any
> further information from the broken setup?  I believe not; I cannot
> debug this any further.

I have not touched UEFI before. I am trying to find out how to switch an
existing installation from legacy boot to UEFI boot (to use a less
broken setup).

[Maybe NUMA+interleaving won't do me any good, but I need to test it
to be sure.]

> I think that the interesting piece of data that can be obtained now is
> the memmap command output from the EFI loader for all three configurations
> (NUMA on/off and interleaving).

What do you mean by 'EFI loader'?
The FreeBSD loader for UEFI mode?
Or the UEFI shell from the BIOS?

> > 
> > > Do you use the latest BIOS for your motherboard?
> > 
> > This is a new MB (X10DRi) with BIOS 2.0; the newest is 2.1, but updating
> > is not simple (I need to prepare a bootable DOS ISO, since most of the
> > utilities don't work under FreeBSD).
> IMO the only way to fix this issue, if it is really important, is
> to contact Supermicro and show them the bug.  But this only makes sense if
> it is reproduced on the latest firmware version.



Re: Enabling NUMA in BIOS stop booting FreeBSD

2016-12-14 Thread Konstantin Belousov
On Wed, Dec 14, 2016 at 03:13:36PM +0300, Slawa Olhovchenkov wrote:
> On Wed, Dec 14, 2016 at 01:39:27PM +0200, Konstantin Belousov wrote:
> 
> > In other words, it is almost certainly a hang and not a fault causing a
> > hang. This means that the machine is not compliant with the IA32
> > architecture; in particular, the region reported as normal memory by
> > the E820 BIOS service does not behave as normal memory.
> > 
> > Since the memory map is the same regardless of the option setting, and
> > the bootstrap page table depends only on the memory map, we use the same
> > page table when hanging and when operating correctly. We do not fault or
> > hang when the option is turned off, which, together with the improved
> > early fault handling in the patch, makes it almost certain that the
> > problem is in the hardware configuration and not in our early setup.
> > 
> > Of course, the most puzzling part is that the memory test makes the hang
> > go away, while repeating the memory test operation only on the msgbuf
> > region does not. msgbuf is special in that it is located at TOHM (top of
> > high memory). It spans 128KB from below it to the last byte of the last
> > physical segment.
> > 
> > The only ideas I have right now are that either there is a bug in the
> > Caching Agent/Home Agent/IMC configuration in the BIOS, in which case
> > there is nothing the OS can do to mitigate it, or the memory map
> > reported by the CSM is wrong (you said that you use legacy boot, right?).
> > The latter would not be too surprising, because the non-EFI boot code
> > path definitely gets less and less testing.
> > 
> > For the latter case (a potential bug in the CSM), could you switch to EFI
> > boot mode and see whether the issue magically heals itself?  You could
> > boot from a USB stick in EFI mode without reinstalling, as a test.
> 
> I can't boot from a USB stick: this is a remote DC and IPMI allows only
> CDROM emulation.
> 
> OK, I booted the 12.0 snapshot ISO in UEFI mode.
> It boots OK.
> 
> Can I convert the installed OS to UEFI mode?
I am not sure what you are asking there.  Are you asking whether I need any
further information from the broken setup?  I believe not; I cannot
debug this any further.

I think that the interesting piece of data that can be obtained now is
the memmap command output from the EFI loader for all three configurations
(NUMA on/off and interleaving).

> 
> > Do you use the latest BIOS for your motherboard?
> 
> This is a new MB (X10DRi) with BIOS 2.0; the newest is 2.1, but updating
> is not simple (I need to prepare a bootable DOS ISO, since most of the
> utilities don't work under FreeBSD).
IMO the only way to fix this issue, if it is really important, is
to contact Supermicro and show them the bug.  But this only makes sense if
it is reproduced on the latest firmware version.


FreeBSD_HEAD_i386 - Build #4383 - Failure

2016-12-14 Thread jenkins-admin
FreeBSD_HEAD_i386 - Build #4383 - Failure:

Build information: https://jenkins.FreeBSD.org/job/FreeBSD_HEAD_i386/4383/
Full change log: https://jenkins.FreeBSD.org/job/FreeBSD_HEAD_i386/4383/changes
Full build log: https://jenkins.FreeBSD.org/job/FreeBSD_HEAD_i386/4383/console

Change summaries:

310056 by ed:
Let all FEATURE()s use the same Prometheus metric.

Without this change, every individual FEATURE() declaration would have
an individual metric in Prometheus. Though this wouldn't be harmful, it
would look very cluttered.

By letting it use a single metric with the name of the feature attached
as a label, it also becomes easier to search, as you can apply regex
matching, etc.

Reviewed by:    cem
Differential Revision:  https://reviews.freebsd.org/D8775

310055 by ed:
Add a "device_index" label to all sysctls under dev.$driver.$index.

This way it becomes possible to graph a property for all instances of a
single driver. For example, graphing the number of packets across all
USB controllers, the amount of dropped packets on all NICs, etc.

Reviewed by:    cem
Differential Revision:  https://reviews.freebsd.org/D8775

310054 by ed:
Attach a "thermal_zone" label to the ACPI thermal zone sysctls.

In order to make Prometheus do graphing/alerting on thermal sensors in a
generic fashion, we should attach the name of the thermal zone device as
a label. That way there is only a single metric for the temperature of a
thermal zone, with its name attached as a label.

Reviewed by:    cem
Differential Revision:  https://reviews.freebsd.org/D8775

310053 by ed:
Add labels to sysctls related to clocks.

Sysctls like kern.eventtimer.et.*.quality currently embed the name of
the clock device. This is problematic for the Prometheus metrics
exporter for two reasons:

- Some of those clocks have dashes in their names, which Prometheus
  doesn't allow to be used in metric names.
- It doesn't allow for extracting the same property of all clocks on the
  system from within a single query.

Attach a label to these nodes, so that the Prometheus metrics
exporter gives these metrics a uniform name with the name of the clock
attached as a label.

Reviewed by:    cem
Differential Revision:  https://reviews.freebsd.org/D8775

310052 by ed:
Add label annotations to CAM sysctls.

Under kern.cam we have certain sysctls that are per-device, such as the
ones under kern.cam.ada.[0-9]+.*. Add a "device_index" label annotation
to such sysctls, so that the Prometheus metrics exporter will give all
of those metrics the same name. The device number will be added to the
metric name as the "device_index" label.

Reviewed by:    cem
Differential Revision:  https://reviews.freebsd.org/D8775

310051 by ed:
Add support for attaching aggregation labels to sysctl objects.

I'm currently working on writing a metrics exporter for the Prometheus
monitoring system to provide access to sysctl metrics. Prometheus and
sysctl have some structural differences:

- sysctl is a tree of string component names.
- Prometheus uses a flat namespace for its metrics, but allows you to
  attach labels with values to them, so that you can do aggregation.

An initial version of my exporter simply translated

hw.acpi.thermal.tz1.temperature

to

sysctl_hw_acpi_thermal_tz1_temperature_celcius

while we should ideally have

sysctl_hw_acpi_thermal_temperature_celcius{thermal_zone="tz1"}

allowing you to graph all thermal zones on a system in one go.

The change presented in this commit adds support for accomplishing this,
by providing the ability to attach labels to nodes. In the example I
gave above, the label "thermal_zone" would be attached to "tz1". As this
is a feature that will only be used very rarely, I decided to not change
the KPI too aggressively.

Discussed on:   hackers@
Reviewed by:    cem
Differential Revision:  https://reviews.freebsd.org/D8775
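
The renaming described in r310051 could be sketched roughly as below. This is
only an illustration and not the exporter's actual code: struct oid_part and
format_metric() are made-up names, the 256-byte label buffer is an arbitrary
choice, and the unit suffix from the example above (_celcius) is left out.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* One component of a sysctl name; label is NULL when none is attached. */
struct oid_part {
	const char *name;
	const char *label;
};

/*
 * Hypothetical helper: format parts[0..n) as a Prometheus-style metric.
 * Components without a label are joined with '_' into the metric name;
 * components with a label become label="name" pairs instead.
 * Returns 0 on success, -1 if a buffer is too small.
 */
int
format_metric(const struct oid_part *parts, size_t n, char *buf, size_t len)
{
	char labels[256] = "";
	size_t used, lused = 0;
	int r;

	r = snprintf(buf, len, "sysctl");
	if (r < 0 || (size_t)r >= len)
		return (-1);
	used = (size_t)r;
	for (size_t i = 0; i < n; i++) {
		if (parts[i].label == NULL) {
			/* Unlabelled component: extend the flat metric name. */
			r = snprintf(buf + used, len - used, "_%s",
			    parts[i].name);
			if (r < 0 || (size_t)r >= len - used)
				return (-1);
			used += (size_t)r;
		} else {
			/* Labelled component: collect it as label="value". */
			r = snprintf(labels + lused, sizeof(labels) - lused,
			    "%s%s=\"%s\"", lused > 0 ? "," : "",
			    parts[i].label, parts[i].name);
			if (r < 0 || (size_t)r >= sizeof(labels) - lused)
				return (-1);
			lused += (size_t)r;
		}
	}
	if (lused > 0) {
		r = snprintf(buf + used, len - used, "{%s}", labels);
		if (r < 0 || (size_t)r >= len - used)
			return (-1);
	}
	return (0);
}
```

Called with the components hw, acpi, thermal, tz1 (labelled "thermal_zone")
and temperature, the sketch produces
sysctl_hw_acpi_thermal_temperature{thermal_zone="tz1"}, so every thermal zone
shares one metric name.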

310050 by kib:
Provide non-final but valid PCB pointer for thread0 for duration of
hammer_time().  This keeps the assembler exception handlers from
faulting themselves when setting PCB flags, and allows the normal kernel
trap handler to get control.  The pointer is reset after FPU parameters
are obtained.

Set thread0.td_critnest to 1 for duration of hammer_time() as well.
In particular, page faults at that early stage panic immediately
instead of trying to call the not yet operational VM to resolve them.

As a result, faults during the second half of the hammer_time() execution
have a chance to be reported instead of causing a silent machine reboot
or hang.

Sponsored by:   The FreeBSD Foundation
MFC after:  2 weeks



The end of the build log:

[...truncated 149114 lines...]
--- all_subdir_apm ---
ctfconvert -L VERSION -g apm.o
--- apm.kld ---
ld -d -warn-common -r -d -o apm.kld apm.o
ctfmerge -L VERSION -g -o apm.kld apm.o
echo apm_display apm_softc > export_syms
awk -f /usr/src/sys/conf/kmod_syms.awk apm.kld  export_syms | xargs -J% objcopy % apm.kld
--- all_subdir_arcmsr ---
--- machine ---
machine -> 

FreeBSD_HEAD_amd64_gcc - Build #1732 - Still Failing

2016-12-14 Thread jenkins-admin
FreeBSD_HEAD_amd64_gcc - Build #1732 - Still Failing:

Build information: https://jenkins.FreeBSD.org/job/FreeBSD_HEAD_amd64_gcc/1732/
Full change log: https://jenkins.FreeBSD.org/job/FreeBSD_HEAD_amd64_gcc/1732/changes
Full build log: https://jenkins.FreeBSD.org/job/FreeBSD_HEAD_amd64_gcc/1732/console

Change summaries:

310049 by np:
cxgbe(4): Fix the tid range shown for T6 cards in misc.tids.

MFC after:  3 days

310048 by sephe:
hyperv: Implement "enlightened" time counter, which is rdtsc based.

Reviewed by:    kib
MFC after:  1 week
Sponsored by:   Microsoft
Differential Revision:  https://reviews.freebsd.org/D8763

310047 by gjb:
- Resize FreeBSD to the size of the OpenStack flavor (growfs).
- Speed up the boot process by disabling sendmail.
- Allow a user to ssh as root with a public key.
- Make ssh(1) respond faster by disabling DNS lookups.
- Enable DHCP on the vtnet(4) interface.

Note: The CLOUDWARE list has not yet been changed to include the
OpenStack target by default.

Submitted by:   Diego Casati
PR: 215258
MFC after:  1 week
Sponsored by:   The FreeBSD Foundation

310046 by jhb:
Add 'const' to fn_name's return type to remove a cast.

310045 by jhb:
Use casts to force an unsigned comparison in db_search_symbol().

On all of our platforms, db_expr_t is a signed integer while
db_addr_t is an unsigned integer value.  db_search_symbol used variables
of type db_expr_t to hold the current offset of the requested address from
the "best" symbol found so far.  This value was initialized to '~0'.
When a new symbol is found from a symbol table, the associated diff for the
new symbol is compared against the existing value as 'if (newdiff < diff)'
to determine if the new symbol had a smaller diff and was thus a closer
match.

On 64-bit MIPS, the '~0' was treated as a negative value (-1).  A lookup
that found a perfect match of an address against a symbol returned a diff
of 0.  However, in signed comparisons, 0 is not less than -1.  As a result,
DDB on 64-bit MIPS never resolved any addresses to symbols.  Work around
this by using casts to force an unsigned comparison.

Probably the diff returned from db_search_symbol() and X_db_search_symbol()
should be changed to a db_addr_t instead of a db_expr_t as it is an
unsigned value (and is an offset of an address, so should fit in the same
size as an address).

Sponsored by:   DARPA / AFRL
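
The comparison problem described in r310045 can be reproduced in isolation.
The sketch below is not the DDB source: sketch_expr_t and sketch_addr_t are
stand-ins assumed to mirror a signed db_expr_t and an unsigned db_addr_t, and
both function names are made up for the illustration.

```c
#include <stdint.h>

typedef int64_t  sketch_expr_t;	/* stand-in for a signed db_expr_t */
typedef uint64_t sketch_addr_t;	/* stand-in for an unsigned db_addr_t */

/*
 * Signed comparison: when best was initialized to ~0 it holds -1,
 * so a perfect match (newdiff == 0) is never considered closer.
 */
int
closer_signed(sketch_expr_t newdiff, sketch_expr_t best)
{
	return (newdiff < best);
}

/*
 * Cast both sides to the unsigned type, so that ~0 behaves as the
 * largest possible offset and any real diff compares as smaller.
 */
int
closer_unsigned(sketch_expr_t newdiff, sketch_expr_t best)
{
	return ((sketch_addr_t)newdiff < (sketch_addr_t)best);
}
```

With best set to ~0, closer_signed(0, best) is false while
closer_unsigned(0, best) is true, which matches the behaviour the commit
message describes before and after the casts.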

310038 by dteske:
Revert r309918 -- modern POSIX has deprecated -<#>/+<#> syntax

Special thanks to:  jilles

310037 by jhb:
Fix stack traces in DDB for the debugger thread.

When the kernel debugger is entered, makectx() is called to store
appropriate state from the trapframe for the debugger into a global
kdb_pcb used as the thread context of the thread entering the
debugger.  Stack unwinders for DDB called via db_trace_thread() are
supposed to then use this saved context so that the stack trace for
the current thread starts at the location of the event that triggered
debugger entry.

MIPS was instead starting the stack trace of the current thread from
the context of db_trace_thread itself and unwinding back out through
the debugger to the original frame.  Fix a couple of things to bring
MIPS in line with other platforms:
- Fix makectx() to store the PC, SP, and RA in the right portion of
  the PCB used by db_trace_thread().
- Fix db_trace_thread() to always use kdb_thr_ctx() (and thus kdb_pcb
  for the debugger thread).
- Move the logic for tracing curthread from within the current
  function into db_trace_self() to match other architectures.

Sponsored by:   DARPA / AFRL

310036 by jkim:
MFV:r309561

Merge byacc 20161202.

310035 by hrs:
Remove an extra "break" which could incorrectly terminate an
STAILQ_FOREACH() loop when an AF_INET6 rule matched.

Spotted by: cem

310033 by np:
cxgbe(4): Retire t4_bus_space_read_8 and t4_bus_space_write_8.

MFC after:  3 days
Sponsored by:   Chelsio Communications

310032 by glebius:
Zero return value when counter_rate() switches over to next second and
value is positive, but below the limit.

310031 by cem:
linuxkpi: Fix not-found case of linux_pci_find_irq_dev

Linux list_for_each_entry() does not necessarily end with the iterator
NULL (it may be an offset from NULL if the list member is not the first
element of the member struct).

Reported by:    Coverity
CID:            1366940
Reviewed by:    hselasky@
Sponsored by:   Dell EMC Isilon
Differential Revision:  https://reviews.freebsd.org/D8780
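
One way to avoid depending on the iterator's value after the loop is to return
the match from inside the loop and return NULL explicitly otherwise. The
sketch below is not the linuxkpi code: struct irq_ent and find_irq() are
hypothetical, and list_head and container_of are minimal re-creations of the
Linux-style helpers so that the example stands alone.

```c
#include <stddef.h>

/* Minimal Linux-style circular list node. */
struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical entry; the list member is deliberately not first. */
struct irq_ent {
	int irq;
	struct list_head links;
};

/*
 * Return the entry whose irq matches, or NULL when none does.
 * The not-found result is stated explicitly: once the walk reaches
 * the head again, container_of() on the head would give a bogus
 * non-NULL pointer, so the iterator itself cannot signal "not found".
 */
struct irq_ent *
find_irq(struct list_head *head, int irq)
{
	struct list_head *p;

	for (p = head->next; p != head; p = p->next) {
		struct irq_ent *e = container_of(p, struct irq_ent, links);

		if (e->irq == irq)
			return (e);
	}
	return (NULL);
}
```

A caller can then test the result against NULL without caring where the list
member sits inside the entry structure.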

310030 by jhb:
Use register_t instead of uintptr_t for register values in backtraces.

This fixes backtraces from DDB in n32 kernels as uintptr_t is only a
uint32_t.  In particular, the upper 32 bits of each register value were
treated as the register's value, breaking both the output of register
values and the values of 'ra' and 'sp' required to walk up to the
previous frame.

Sponsored by:   DARPA / AFRL
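
The truncation described in r310030 can be illustrated with a saved register
slot laid out in big-endian byte order. This is a sketch under assumptions:
the big-endian layout is inferred from the statement that the upper 32 bits
were picked up, and the function names are invented for the example rather
than taken from the MIPS backtrace code.

```c
#include <stdint.h>

/* Store v as 8 big-endian bytes, like a 64-bit register save slot. */
void
save_reg_be(uint64_t v, unsigned char slot[8])
{
	for (int i = 0; i < 8; i++)
		slot[i] = (unsigned char)(v >> (56 - 8 * i));
}

/* Read only the first 4 bytes: what a 32-bit-wide load of the slot sees. */
uint32_t
load_narrow_be(const unsigned char slot[8])
{
	return ((uint32_t)slot[0] << 24 | (uint32_t)slot[1] << 16 |
	    (uint32_t)slot[2] << 8 | (uint32_t)slot[3]);
}

/* Read all 8 bytes: what a register-wide load of the slot sees. */
uint64_t
load_wide_be(const unsigned char slot[8])
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = v << 8 | slot[i];
	return (v);
}
```

For a saved value of 0xffffffff80123456, the narrow load yields 0xffffffff
(the upper half only), while the wide load recovers the full value needed
for 'ra' and 'sp'.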

310029 by jhb:
Fix remove_userlocal_code() for n32.

n32 kernels use a 64-bit 

Re: Enabling NUMA in BIOS stop booting FreeBSD

2016-12-14 Thread Slawa Olhovchenkov
On Wed, Dec 14, 2016 at 01:39:27PM +0200, Konstantin Belousov wrote:

> In other words, it is almost certainly a hang and not a fault causing a
> hang. This means that the machine is not compliant with the IA32
> architecture; in particular, the region reported as normal memory by
> the E820 BIOS service does not behave as normal memory.
> 
> Since the memory map is the same regardless of the option setting, and
> the bootstrap page table depends only on the memory map, we use the same
> page table when hanging and when operating correctly. We do not fault or
> hang when the option is turned off, which, together with the improved
> early fault handling in the patch, makes it almost certain that the
> problem is in the hardware configuration and not in our early setup.
> 
> Of course, the most puzzling part is that the memory test makes the hang
> go away, while repeating the memory test operation only on the msgbuf
> region does not. msgbuf is special in that it is located at TOHM (top of
> high memory). It spans 128KB from below it to the last byte of the last
> physical segment.
> 
> The only ideas I have right now are that either there is a bug in the
> Caching Agent/Home Agent/IMC configuration in the BIOS, in which case
> there is nothing the OS can do to mitigate it, or the memory map
> reported by the CSM is wrong (you said that you use legacy boot, right?).
> The latter would not be too surprising, because the non-EFI boot code
> path definitely gets less and less testing.
> 
> For the latter case (a potential bug in the CSM), could you switch to EFI
> boot mode and see whether the issue magically heals itself?  You could
> boot from a USB stick in EFI mode without reinstalling, as a test.

I can't boot from a USB stick: this is a remote DC and IPMI allows only
CDROM emulation.

OK, I booted the 12.0 snapshot ISO in UEFI mode.
It boots OK.

Can I convert the installed OS to UEFI mode?

> Do you use the latest BIOS for your motherboard?

This is a new MB (X10DRi) with BIOS 2.0; the newest is 2.1, but updating
is not simple (I need to prepare a bootable DOS ISO, since most of the
utilities don't work under FreeBSD).


Re: Enabling NUMA in BIOS stop booting FreeBSD

2016-12-14 Thread Konstantin Belousov
On Wed, Dec 14, 2016 at 01:52:11PM +0300, Slawa Olhovchenkov wrote:
> Booting...
> KDB: debugger backends: ddb
> KDB: current backend: ddb
> SMAP type=01 base= len=00099c00
> SMAP type=02 base=00099c00 len=6400
> SMAP type=02 base=000e len=0002
> SMAP type=01 base=0010 len=7906b000
> SMAP type=02 base=7916b000 len=00936000
> SMAP type=04 base=79aa1000 len=00509000
> SMAP type=02 base=79faa000 len=02056000
> SMAP type=01 base=0001 len=001f8000
> SMAP type=02 base=7c00 len=1400
> SMAP type=02 base=fed1c000 len=00029000
> SMAP type=02 base=ff00 len=0100
> TTT1 0xf8207ff0 0xf8207fb8 10
> . 0
> . 1000
> . 2000
> . 3000
> . 4000
> . 5000
> . 6000
> . 7000
> . 8000
> . 9000
> . a000
> . b000
> . c000
> . d000
> . e000
> . f000
> . 1
> . 11000
> . 12000
> . 13000
> . 14000
> . 15000
> . 16000
> . 17000
> . 18000
> . 19000
> . 1a000
> . 1b000
> . 1c000
> . 1d000
> . 1e000
> . 1f000
> . 2
> . 21000
> . 22000
> . 23000
> . 24000
> . 25000
> . 26000
> . 27000
> . 28000
> . 29000
> . 2a000
> . 2b000

In other words, it is almost certainly a hang and not a fault causing a
hang. This means that the machine is not compliant with the IA32
architecture; in particular, the region reported as normal memory by
the E820 BIOS service does not behave as normal memory.

Since the memory map is the same regardless of the option setting, and
the bootstrap page table depends only on the memory map, we use the same
page table when hanging and when operating correctly. We do not fault or
hang when the option is turned off, which, together with the improved
early fault handling in the patch, makes it almost certain that the
problem is in the hardware configuration and not in our early setup.

Of course, the most puzzling part is that the memory test makes the hang
go away, while repeating the memory test operation only on the msgbuf
region does not. msgbuf is special in that it is located at TOHM (top of
high memory). It spans 128KB from below it to the last byte of the last
physical segment.

The only ideas I have right now are that either there is a bug in the
Caching Agent/Home Agent/IMC configuration in the BIOS, in which case
there is nothing the OS can do to mitigate it, or the memory map
reported by the CSM is wrong (you said that you use legacy boot, right?).
The latter would not be too surprising, because the non-EFI boot code
path definitely gets less and less testing.

For the latter case (a potential bug in the CSM), could you switch to EFI
boot mode and see whether the issue magically heals itself?  You could
boot from a USB stick in EFI mode without reinstalling, as a test.

Do you use the latest BIOS for your motherboard?


Re: Enabling NUMA in BIOS stop booting FreeBSD

2016-12-14 Thread Slawa Olhovchenkov
On Wed, Dec 14, 2016 at 12:27:11PM +0200, Konstantin Belousov wrote:

> On Wed, Dec 14, 2016 at 11:53:50AM +0200, Konstantin Belousov wrote:
> > On Tue, Dec 13, 2016 at 08:43:45PM +0300, Slawa Olhovchenkov wrote:
> > > On Tue, Dec 13, 2016 at 07:25:29PM +0200, Konstantin Belousov wrote:
> > > 
> > > > This is not what I expected.
> > > > Also, I realized that I mis-read the memory test code.  It does not
> > > > obliterate memory, old content is preserved.
> > > > 
> > > > Please do exactly the same testing with another patch, at the end of the
> > > > message.  There could be more output, up to 256 lines.
> > > 
> > > No problem.
> > > 
> > > Booting...
> > > KDB: debugger backends: ddb
> > > KDB: current backend: ddb
> > > SMAP type=01 base= len=00099c00
> > > SMAP type=02 base=00099c00 len=6400
> > > SMAP type=02 base=000e len=0002
> > > SMAP type=01 base=0010 len=7906b000
> > > SMAP type=02 base=7916b000 len=00936000
> > > SMAP type=04 base=79aa1000 len=00509000
> > > SMAP type=02 base=79faa000 len=02056000
> > > SMAP type=01 base=0001 len=001f8000
> > > SMAP type=02 base=7c00 len=1400
> > > SMAP type=02 base=fed1c000 len=00029000
> > > SMAP type=02 base=ff00 len=0100
> > > TTT1 0xf8207ff0 0xf8207fb8 10
> > > . 0
> > > . 1000
> > > . 2000
> > > . 3000
> > > . 4000
> > > . 5000
> > > . 6000
> > > . 7000
> > > . 8000
> > > . 9000
> > > . a000
> > > . b000
> > > . c000
> > > . d000
> > > . e000
> > > . f000
> > > . 1
> > > . 11000
> > > . 12000
> > > . 13000
> > > . 14000
> > > . 15000
> > > . 16000
> > > . 17000
> > > . 18000
> > > . 19000
> > > . 1a000
> > > . 1b000
> > > . 1c000
> > > . 1d000
> > > . 1e000
> > > . 1f000
> > > . 2
> > > . 21000
> > > . 22000
> > > . 23000
> > > . 24000
> > > . 25000
> > > . 26000
> > > . 27000
> > > . 28000
> > > . 29000
> > > . 2a000
> > > . 2b000
> > > 
> > 
> > Do you still have access to the machine ?
> > If yes, please try this patch (against clean tree, as always) with the
> > same instructions as before.
> > 
> 
> Updated patch, it should provide the expected information in case of
> page fault.

Booting...
KDB: debugger backends: ddb
KDB: current backend: ddb
SMAP type=01 base= len=00099c00
SMAP type=02 base=00099c00 len=6400
SMAP type=02 base=000e len=0002
SMAP type=01 base=0010 len=7906b000
SMAP type=02 base=7916b000 len=00936000
SMAP type=04 base=79aa1000 len=00509000
SMAP type=02 base=79faa000 len=02056000
SMAP type=01 base=0001 len=001f8000
SMAP type=02 base=7c00 len=1400
SMAP type=02 base=fed1c000 len=00029000
SMAP type=02 base=ff00 len=0100
TTT1 0xf8207ff0 0xf8207fb8 10
. 0
. 1000
. 2000
. 3000
. 4000
. 5000
. 6000
. 7000
. 8000
. 9000
. a000
. b000
. c000
. d000
. e000
. f000
. 1
. 11000
. 12000
. 13000
. 14000
. 15000
. 16000
. 17000
. 18000
. 19000
. 1a000
. 1b000
. 1c000
. 1d000
. 1e000
. 1f000
. 2
. 21000
. 22000
. 23000
. 24000
. 25000
. 26000
. 27000
. 28000
. 29000
. 2a000
. 2b000



Re: Enabling NUMA in BIOS stop booting FreeBSD

2016-12-14 Thread Konstantin Belousov
On Wed, Dec 14, 2016 at 11:53:50AM +0200, Konstantin Belousov wrote:
> On Tue, Dec 13, 2016 at 08:43:45PM +0300, Slawa Olhovchenkov wrote:
> > On Tue, Dec 13, 2016 at 07:25:29PM +0200, Konstantin Belousov wrote:
> > 
> > > This is not what I expected.
> > > Also, I realized that I mis-read the memory test code.  It does not
> > > obliterate memory, old content is preserved.
> > > 
> > > Please do exactly the same testing with another patch, at the end of the
> > > message.  There could be more output, up to 256 lines.
> > 
> > No problem.
> > 
> > Booting...
> > KDB: debugger backends: ddb
> > KDB: current backend: ddb
> > SMAP type=01 base= len=00099c00
> > SMAP type=02 base=00099c00 len=6400
> > SMAP type=02 base=000e len=0002
> > SMAP type=01 base=0010 len=7906b000
> > SMAP type=02 base=7916b000 len=00936000
> > SMAP type=04 base=79aa1000 len=00509000
> > SMAP type=02 base=79faa000 len=02056000
> > SMAP type=01 base=0001 len=001f8000
> > SMAP type=02 base=7c00 len=1400
> > SMAP type=02 base=fed1c000 len=00029000
> > SMAP type=02 base=ff00 len=0100
> > TTT1 0xf8207ff0 0xf8207fb8 10
> > . 0
> > . 1000
> > . 2000
> > . 3000
> > . 4000
> > . 5000
> > . 6000
> > . 7000
> > . 8000
> > . 9000
> > . a000
> > . b000
> > . c000
> > . d000
> > . e000
> > . f000
> > . 1
> > . 11000
> > . 12000
> > . 13000
> > . 14000
> > . 15000
> > . 16000
> > . 17000
> > . 18000
> > . 19000
> > . 1a000
> > . 1b000
> > . 1c000
> > . 1d000
> > . 1e000
> > . 1f000
> > . 2
> > . 21000
> > . 22000
> > . 23000
> > . 24000
> > . 25000
> > . 26000
> > . 27000
> > . 28000
> > . 29000
> > . 2a000
> > . 2b000
> > 
> 
> Do you still have access to the machine?
> If yes, please try this patch (against clean tree, as always) with the
> same instructions as before.
> 

Updated patch, it should provide the expected information in case of
page fault.

diff --git a/sys/amd64/amd64/machdep.c b/sys/amd64/amd64/machdep.c
index b2283339405..682307f5fe4 100644
--- a/sys/amd64/amd64/machdep.c
+++ b/sys/amd64/amd64/machdep.c
@@ -1673,6 +1673,16 @@ hammer_time(u_int64_t modulep, u_int64_t physfree)
wrmsr(MSR_SF_MASK, PSL_NT|PSL_T|PSL_I|PSL_C|PSL_D);
 
/*
+* Temporary forge some valid pointer to PCB, for exception
+* handlers.  It is reinitialized properly below after FPU is
+* set up.  Also set up td_critnest to short-cut the page
+* fault handler.
+*/
+   cpu_max_ext_state_size = sizeof(struct savefpu);
+   thread0.td_pcb = get_pcb_td();
+   thread0.td_critnest = 1;
+
+   /*
 * The console and kdb should be initialized even earlier than here,
 * but some console drivers don't work until after getmemsize().
 * Default to late console initialization to support these drivers.
@@ -1762,6 +1772,7 @@ hammer_time(u_int64_t modulep, u_int64_t physfree)
 #ifdef FDT
x86_init_fdt();
 #endif
+   thread0.td_critnest = 0;
 
/* Location of kernel stack for locore */
return ((u_int64_t)thread0.td_pcb);
diff --git a/sys/kern/subr_msgbuf.c b/sys/kern/subr_msgbuf.c
index f275aef3b4f..1be7a629f65 100644
--- a/sys/kern/subr_msgbuf.c
+++ b/sys/kern/subr_msgbuf.c
@@ -67,14 +67,19 @@ msgbuf_init(struct msgbuf *mbp, void *ptr, int size)
mbp->msg_ptr = ptr;
mbp->msg_size = size;
mbp->msg_seqmod = SEQMOD(size);
+printf("YYY1\n");
msgbuf_clear(mbp);
+printf("YYY2\n");
mbp->msg_magic = MSG_MAGIC;
mbp->msg_lastpri = -1;
mbp->msg_flags = 0;
+printf("YYY3\n");
bzero(&mbp->msg_lock, sizeof(mbp->msg_lock));
mtx_init(&mbp->msg_lock, "msgbuf", NULL, MTX_SPIN);
+printf("YYY4\n");
 }
 
+
 /*
  * Reinitialize a message buffer, retaining its previous contents if
  * the size and checksum are correct. If the old contents cannot be
@@ -85,8 +90,10 @@ msgbuf_reinit(struct msgbuf *mbp, void *ptr, int size)
 {
u_int cksum;
 
-   if (mbp->msg_magic != MSG_MAGIC || mbp->msg_size != size) {
+   if (1 || mbp->msg_magic != MSG_MAGIC || mbp->msg_size != size) {
+printf("XXX1\n");
msgbuf_init(mbp, ptr, size);
+printf("XXX2\n");
return;
}
mbp->msg_seqmod = SEQMOD(size);
@@ -117,10 +124,12 @@ void
 msgbuf_clear(struct msgbuf *mbp)
 {
 
+printf("ZZZ1\n");
bzero(mbp->msg_ptr, mbp->msg_size);
mbp->msg_wseq = 0;
mbp->msg_rseq = 0;
mbp->msg_cksum = 0;
+printf("ZZZ2\n");
 }
 
 /*
diff --git a/sys/kern/subr_prf.c b/sys/kern/subr_prf.c
index e78863830c7..a72984dbc19 100644
--- a/sys/kern/subr_prf.c
+++ b/sys/kern/subr_prf.c
@@ -998,6 +998,14 @@ msgbufinit(void *ptr, int size)
char *cp;
static struct msgbuf *oldp = NULL;
 
+printf("TTT1 %p %p %x\n", ptr, (char *)ptr + size - 

Re: Enabling NUMA in BIOS stop booting FreeBSD

2016-12-14 Thread Konstantin Belousov
On Tue, Dec 13, 2016 at 08:43:45PM +0300, Slawa Olhovchenkov wrote:
> On Tue, Dec 13, 2016 at 07:25:29PM +0200, Konstantin Belousov wrote:
> 
> > This is not what I expected.
> > Also, I realized that I mis-read the memory test code.  It does not
> > obliterate memory, old content is preserved.
> > 
> > Please do exactly the same testing with another patch, at the end of the
> > message.  There could be more output, up to 256 lines.
> 
> No problem.
> 
> Booting...
> KDB: debugger backends: ddb
> KDB: current backend: ddb
> SMAP type=01 base= len=00099c00
> SMAP type=02 base=00099c00 len=6400
> SMAP type=02 base=000e len=0002
> SMAP type=01 base=0010 len=7906b000
> SMAP type=02 base=7916b000 len=00936000
> SMAP type=04 base=79aa1000 len=00509000
> SMAP type=02 base=79faa000 len=02056000
> SMAP type=01 base=0001 len=001f8000
> SMAP type=02 base=7c00 len=1400
> SMAP type=02 base=fed1c000 len=00029000
> SMAP type=02 base=ff00 len=0100
> TTT1 0xf8207ff0 0xf8207fb8 10
> . 0
> . 1000
> . 2000
> . 3000
> . 4000
> . 5000
> . 6000
> . 7000
> . 8000
> . 9000
> . a000
> . b000
> . c000
> . d000
> . e000
> . f000
> . 1
> . 11000
> . 12000
> . 13000
> . 14000
> . 15000
> . 16000
> . 17000
> . 18000
> . 19000
> . 1a000
> . 1b000
> . 1c000
> . 1d000
> . 1e000
> . 1f000
> . 2
> . 21000
> . 22000
> . 23000
> . 24000
> . 25000
> . 26000
> . 27000
> . 28000
> . 29000
> . 2a000
> . 2b000
> 

Do you still have access to the machine?
If yes, please try this patch (against clean tree, as always) with the
same instructions as before.

diff --git a/sys/amd64/amd64/machdep.c b/sys/amd64/amd64/machdep.c
index b2283339405..917ea4475f3 100644
--- a/sys/amd64/amd64/machdep.c
+++ b/sys/amd64/amd64/machdep.c
@@ -1673,6 +1673,14 @@ hammer_time(u_int64_t modulep, u_int64_t physfree)
wrmsr(MSR_SF_MASK, PSL_NT|PSL_T|PSL_I|PSL_C|PSL_D);
 
/*
+* Temporary forge some valid pointer to PCB, for exception
+* handlers.  It is reinitialized properly below after FPU is
+* set up.
+*/
+   cpu_max_ext_state_size = sizeof(struct savefpu);
+   thread0.td_pcb = get_pcb_td();
+
+   /*
 * The console and kdb should be initialized even earlier than here,
 * but some console drivers don't work until after getmemsize().
 * Default to late console initialization to support these drivers.
diff --git a/sys/kern/subr_msgbuf.c b/sys/kern/subr_msgbuf.c
index f275aef3b4f..1be7a629f65 100644
--- a/sys/kern/subr_msgbuf.c
+++ b/sys/kern/subr_msgbuf.c
@@ -67,14 +67,19 @@ msgbuf_init(struct msgbuf *mbp, void *ptr, int size)
mbp->msg_ptr = ptr;
mbp->msg_size = size;
mbp->msg_seqmod = SEQMOD(size);
+printf("YYY1\n");
msgbuf_clear(mbp);
+printf("YYY2\n");
mbp->msg_magic = MSG_MAGIC;
mbp->msg_lastpri = -1;
mbp->msg_flags = 0;
+printf("YYY3\n");
bzero(&mbp->msg_lock, sizeof(mbp->msg_lock));
mtx_init(&mbp->msg_lock, "msgbuf", NULL, MTX_SPIN);
+printf("YYY4\n");
 }
 
+
 /*
  * Reinitialize a message buffer, retaining its previous contents if
  * the size and checksum are correct. If the old contents cannot be
@@ -85,8 +90,10 @@ msgbuf_reinit(struct msgbuf *mbp, void *ptr, int size)
 {
u_int cksum;
 
-   if (mbp->msg_magic != MSG_MAGIC || mbp->msg_size != size) {
+   if (1 || mbp->msg_magic != MSG_MAGIC || mbp->msg_size != size) {
+printf("XXX1\n");
msgbuf_init(mbp, ptr, size);
+printf("XXX2\n");
return;
}
mbp->msg_seqmod = SEQMOD(size);
@@ -117,10 +124,12 @@ void
 msgbuf_clear(struct msgbuf *mbp)
 {
 
+printf("ZZZ1\n");
bzero(mbp->msg_ptr, mbp->msg_size);
mbp->msg_wseq = 0;
mbp->msg_rseq = 0;
mbp->msg_cksum = 0;
+printf("ZZZ2\n");
 }
 
 /*
diff --git a/sys/kern/subr_prf.c b/sys/kern/subr_prf.c
index e78863830c7..a72984dbc19 100644
--- a/sys/kern/subr_prf.c
+++ b/sys/kern/subr_prf.c
@@ -998,6 +998,14 @@ msgbufinit(void *ptr, int size)
char *cp;
static struct msgbuf *oldp = NULL;
 
+printf("TTT1 %p %p %x\n", ptr, (char *)ptr + size - sizeof(*msgbufp), size);
+for (int i = 0; i < size; i++) {
+if (i % PAGE_SIZE == 0) printf(". %x\n", i);
+   volatile char *c = (char *)ptr + i;
+   char tmp;
+   tmp = *c;
+   *c = tmp;
+}
size -= sizeof(*msgbufp);
cp = (char *)ptr;
msgbufp = (struct msgbuf *)(cp + size);
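
For readers following the thread, the probe loop added to msgbufinit() in the
patch above can be tried outside the kernel. What follows is only a user-space
sketch, not part of the patch: probe_region() and SKETCH_PAGE_SIZE are
hypothetical names, and a 4 KiB page size is assumed. Like the loop in the
patch, it reads each byte and stores the same value back, so the old content
is preserved, and it prints a marker before touching the first byte of each
page.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumption for this sketch: 4 KiB pages, as on amd64. */
#define SKETCH_PAGE_SIZE 4096

/*
 * Touch every byte of buf: read it and store the same value back, so
 * the contents are left unchanged.  A ". <offset>" line is printed
 * before the first byte of each page is accessed.  Returns the number
 * of markers printed.
 */
static size_t
probe_region(void *buf, size_t size)
{
	size_t markers = 0;

	for (size_t i = 0; i < size; i++) {
		if (i % SKETCH_PAGE_SIZE == 0) {
			printf(". %zx\n", i);
			markers++;
		}
		/* volatile keeps the compiler from dropping the accesses */
		volatile char *c = (volatile char *)buf + i;
		char tmp = *c;
		*c = tmp;
	}
	return (markers);
}
```

Since the marker is printed before the page is touched, the last ". <offset>"
line in an output like the one quoted above names the last page the loop
started on, not necessarily the last page it finished.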


Re: Help

2016-12-14 Thread Hans Petter Selasky

On 12/13/16 05:08, Lewis ingraham wrote:

4. Another potential problem with USB drive detection as well. USB ports
work just fine in something like Windows and Linux but not FreeBSD.


Can you show dmesg of failed enumerations? Did you try to set some 
device quirks?


--HPS