Re: rng padlock changes causes NetBSD to crash

2017-02-16 Thread Andrius V
Yes, I've read that the PadLock design is not elegant at all and, true, I
don't think we will see it redesigned anymore... I didn't expect there
to be any difference between the i386 and amd64 kernels, but after your
explanation it makes sense. I have rather limited access to the Nano
devices, so I just used the i386 kernel on all of them when I had a
chance. Thank you for the fix.

On Thu, Feb 16, 2017 at 5:04 PM, Thor Lancelot Simon  wrote:
> On Thu, Feb 16, 2017 at 01:04:33PM +0200, Andrius V wrote:
>> Hi,
>>
>> I have tested the fix.  lcr4(rcr4() | CR4_OSFXSR); helps indeed and
>> system boots but if statement seems to be not correct, at least on
>> VT-310DP board it ended up in the same error.
>
> I checked in an unconditional version of the fix.
>
> It's interesting to note (again as pointed out by jak@) 64 bit
> kernels would not have had this problem, since they enable SSE
> very early in CPU startup.
>
> The underlying hardware cause for this seems to also be why we
> must mess with the coprocessor enables in CR0 before calling
> any ACE/RNG instructions -- in Jonathan's testing, even a single
> call to xstorerng causes a coprocessor (DNA) fault if the FPU's
> off.  Unfortunate, since all that monkeying around (interrupt
> disable, turn off preemption, whack coproc regs) increases the
> cost of getting 8 random bytes by a factor of at least 10.
>
> The VIA manual suggests they intended to move PadLock out of
> the FPU in later designs but I think at this point it is
> fairly clear there won't be any of those.
>
> Thor


Re: rng padlock changes causes NetBSD to crash

2017-02-16 Thread Thor Lancelot Simon
On Thu, Feb 16, 2017 at 01:04:33PM +0200, Andrius V wrote:
> Hi,
> 
> I have tested the fix.  lcr4(rcr4() | CR4_OSFXSR); helps indeed and
> system boots but if statement seems to be not correct, at least on
> VT-310DP board it ended up in the same error.

I checked in an unconditional version of the fix.
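
The difference between the conditional patch and the unconditional
version can be sketched in userland C. This is only a model under
stated assumptions: fake_cr4 is a plain variable standing in for the
real register, rcr4()/lcr4() here are stand-ins for the kernel
accessors of the same name, and the feature word is passed in as a
parameter because the thread does not say why the if test failed on
the VT-310DP. The bit positions (CR4 bit 9 for OSFXSR, CPUID leaf 1
EDX bit 24 for FXSR) are the architectural ones.

```c
#include <stdint.h>

#define CR4_OSFXSR  0x00000200u  /* CR4 bit 9: OS supports FXSAVE/FXRSTOR */
#define CPUID_FXSR  0x01000000u  /* CPUID.1:EDX bit 24: FXSR available */

/* Hypothetical stand-in for the CR4 control register. */
static uint32_t fake_cr4;

/* Userland models of the kernel's read/load CR4 accessors. */
static uint32_t rcr4(void) { return fake_cr4; }
static void lcr4(uint32_t v) { fake_cr4 = v; }

/*
 * Conditional form, as in the first patch: OSFXSR is set only if the
 * feature word passed in already has the FXSR bit.
 */
static void enable_sse_conditional(uint32_t feature_word)
{
    if (feature_word & CPUID_FXSR)
        lcr4(rcr4() | CR4_OSFXSR);
}

/* Unconditional form: always set OSFXSR, preserving the other bits. */
static void enable_sse_unconditional(void)
{
    lcr4(rcr4() | CR4_OSFXSR);
}
```

In this model, a feature word without the FXSR bit leaves CR4 untouched
in the conditional form, which would be consistent with the same trap
recurring; the unconditional form sets the bit regardless.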

It's interesting to note (again as pointed out by jak@) 64 bit
kernels would not have had this problem, since they enable SSE
very early in CPU startup.

The underlying hardware cause for this seems to also be why we
must mess with the coprocessor enables in CR0 before calling
any ACE/RNG instructions -- in Jonathan's testing, even a single
call to xstorerng causes a coprocessor (DNA) fault if the FPU's
off.  Unfortunate, since all that monkeying around (interrupt
disable, turn off preemption, whack coproc regs) increases the
cost of getting 8 random bytes by a factor of at least 10.
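
The sequence described above (disable interrupts, turn off preemption,
clear the coprocessor bits in CR0, issue the instruction, restore) can
be modeled roughly in userland C. Everything here is hypothetical
scaffolding rather than the kernel code: fake_cr0, intr_enabled and
preempt_disable_depth are plain variables, model_xstorerng() only
imitates the instruction, and treating either EM or TS as "FPU off" is
a simplification. Only the CR0 bit positions are architectural.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CR0_EM 0x00000004u  /* CR0 bit 2: emulate coprocessor */
#define CR0_TS 0x00000008u  /* CR0 bit 3: task switched */

/* Hypothetical machine state; starts with the FPU "off" (TS set). */
static uint32_t fake_cr0 = CR0_TS;
static bool intr_enabled = true;
static int preempt_disable_depth = 0;

static uint32_t rcr0(void) { return fake_cr0; }
static void lcr0(uint32_t v) { fake_cr0 = v; }

/*
 * Model of XSTORE-RNG: returns false where real hardware would raise
 * a coprocessor (DNA) fault, otherwise fills 8 placeholder bytes.
 */
static bool model_xstorerng(uint8_t out[8])
{
    if (fake_cr0 & (CR0_EM | CR0_TS))
        return false;
    memset(out, 0xA5, 8);  /* placeholder pattern, not random data */
    return true;
}

/* Wrap the instruction in the save/clear/restore dance. */
static bool model_rng_read(uint8_t out[8])
{
    intr_enabled = false;                 /* interrupt disable */
    preempt_disable_depth++;              /* turn off preemption */
    uint32_t saved = rcr0();
    lcr0(saved & ~(CR0_EM | CR0_TS));     /* whack coproc regs */

    bool ok = model_xstorerng(out);

    lcr0(saved);                          /* put CR0 back as found */
    preempt_disable_depth--;
    intr_enabled = true;
    return ok;
}
```

Calling model_xstorerng() directly fails in the initial state, while
model_rng_read() succeeds and leaves CR0 as it found it; the extra
steps around the one instruction are where the added cost comes from.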

The VIA manual suggests they intended to move PadLock out of
the FPU in later designs but I think at this point it is
fairly clear there won't be any of those.

Thor


Re: rng padlock changes causes NetBSD to crash

2017-02-16 Thread Andrius V
Hi,

I have tested the fix.  lcr4(rcr4() | CR4_OSFXSR); does help and the
system boots, but the if statement seems to be incorrect: at least on
the VT-310DP board it ended up in the same error.

On Wed, Feb 15, 2017 at 6:35 PM, Thor Lancelot Simon  wrote:
> On Wed, Feb 15, 2017 at 10:25:36AM +0200, Andrius V wrote:
>>
>> Crash looks like this:
>>
>> cpu_rng: VIA
>> fatal privileged instruction fault in supervisor mode
>> trap type 0 code 0 eip c0132377 cs 8 eflags 10046 cr2 0 ilevel 8 esp 80050033
>> curlwp 0xc12427e0 pid 0 lid 1 lowest kstack 0xc14d82c0
>> kernel: supervisor trap privileged instruction fault, code=0
>> Stopped in pid 0.1 (system) at netbsd:cpu_rng+0x4a: xstore-rng.
>
> Jonathan Kollasch found the bug: cpu_rng is called before the FPU
> detection code, which would normally enable SSE.  SSE must be enabled
> for the PadLock instructions to actually work, regardless of what the MSR
> values say.
>
> Can you test the following patch, before I commit it?
>
>
> Index: identcpu.c
> ===
> RCS file: /cvsroot/src/sys/arch/x86/x86/identcpu.c,v
> retrieving revision 1.52
> diff -u -p -r1.52 identcpu.c
> --- identcpu.c  2 Feb 2017 08:57:04 -   1.52
> +++ identcpu.c  15 Feb 2017 16:32:15 -
> @@ -551,7 +551,20 @@ cpu_probe_c3(struct cpu_info *ci)
>              }
>          }
>
> -        /* Actually do the enables. */
> +        /*
> +         * Actually do the enables.  It's a little gross,
> +         * but per the PadLock programming guide, "Enabling
> +         * PadLock", condition 3, we must enable SSE too or
> +         * else the first use of RNG or ACE instructions
> +         * will generate a trap.
> +         *
> +         * We must do this early because of kernel RNG
> +         * initialization but it is safe without the full
> +         * FPU-detect as all these CPUs have SSE.
> +         */
> +        if (cpu_feature[0] & CPUID_FXSR)
> +            lcr4(rcr4() | CR4_OSFXSR);
> +
>          if (rng_enable) {
>              msr = rdmsr(MSR_VIA_RNG);
>              msr |= MSR_VIA_RNG_ENABLE;


Re: rng padlock changes causes NetBSD to crash

2017-02-15 Thread Thor Lancelot Simon
On Wed, Feb 15, 2017 at 10:25:36AM +0200, Andrius V wrote:
> Hello,
> 
> I have recently decided to test changes in this commit
> https://mail-archive.com/source-changes@netbsd.org/msg64898.html.
> Unfortunately NetBSD (i386) crashes on boot in all systems I have
> tried with which includes VIA VT-310DP (two C5P based Eden-N 1GHz
> CPUs), EPIA-M900 (Nano X2 1.6GHz), Jetway JNF76 (Nano U2300) in the
> same fashion. I have a question if these changes have been ever tested
> on real hardware and does it work for any of you? Should I make a new
> bug report for this?

Despite repeated requests, nobody ever came forward with real hardware
to test on, except a few people who turned out to have very old chips
that lacked the RNG.

It'd be nice to have the output of DDB "trace" and "regs", but maybe I
can find the issue by code inspection -- if you crashed _at_ the
XSTORERNG instruction odds are I flipped two of the arguments to the
asm statement or something.

What happens if you boot an amd64 kernel on the Nano machines?

Thor


rng padlock changes causes NetBSD to crash

2017-02-15 Thread Andrius V
Hello,

I recently decided to test the changes in this commit:
https://mail-archive.com/source-changes@netbsd.org/msg64898.html.
Unfortunately, NetBSD (i386) crashes on boot, in the same fashion, on
all the systems I have tried: VIA VT-310DP (two C5P-based Eden-N 1GHz
CPUs), EPIA-M900 (Nano X2 1.6GHz), and Jetway JNF76 (Nano U2300). Have
these changes ever been tested on real hardware, and do they work for
any of you? Should I file a new bug report for this?

Crash looks like this:

cpu_rng: VIA
fatal privileged instruction fault in supervisor mode
trap type 0 code 0 eip c0132377 cs 8 eflags 10046 cr2 0 ilevel 8 esp 80050033
curlwp 0xc12427e0 pid 0 lid 1 lowest kstack 0xc14d82c0
kernel: supervisor trap privileged instruction fault, code=0
Stopped in pid 0.1 (system) at netbsd:cpu_rng+0x4a: xstore-rng.

Regards,
Andrius Varanavicius