Re: [PATCH 1/2] powerpc: Fix user data corruption with P9N DD2.1 VSX CI load workaround emulation

2020-10-20 Thread Michael Ellerman
On Tue, 13 Oct 2020 15:37:40 +1100, Michael Neuling wrote:
> __get_user_atomic_128_aligned() stores to kaddr using stvx which is a
> VMX store instruction, hence kaddr must be 16 byte aligned otherwise
> the store won't occur as expected.
> 
> Unfortunately when we call __get_user_atomic_128_aligned() in
> p9_hmi_special_emu(), the buffer we pass as kaddr (ie. vbuf) isn't
> guaranteed to be 16B aligned. This means that the write to vbuf in
> __get_user_atomic_128_aligned() has the bottom bits of the address
> truncated. This results in other local variables being
> overwritten. Also vbuf will not contain the correct data which results
> in the userspace emulation being wrong and hence user data corruption.
> 
> [...]

Applied to powerpc/fixes.

[1/2] powerpc: Fix undetected data corruption with P9N DD2.1 VSX CI load emulation
      https://git.kernel.org/powerpc/c/1da4a0272c5469169f78cd76cf175ff984f52f06
[2/2] selftests/powerpc: Make alignment handler test P9N DD2.1 vector CI load workaround
      https://git.kernel.org/powerpc/c/d1781f23704707d350b8c9006e2bdf5394bf91b2

cheers
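
The failure mode described above can be sketched in a few lines: a 16-byte VMX store to an unaligned buffer lands at the address with its low 4 bits cleared, clobbering whatever local happens to sit below. A rough model of this (pure Python, no VMX involved; the toy "stack" layout is made up for illustration):

```python
def stvx_store(memory, addr, data16):
    """Model of stvx: the low 4 bits of the effective address are
    ignored, so the 16-byte store always hits a 16B-aligned slot."""
    assert len(data16) == 16
    ea = addr & ~0xF          # hardware truncates the bottom bits
    memory[ea:ea + 16] = data16

# A toy "stack": some other local at offsets 0..7, vbuf at offset 8
# (i.e. not 16-byte aligned).
stack = bytearray(24)
vbuf_addr = 8
stvx_store(stack, vbuf_addr, b'\xAA' * 16)

# The store landed at offset 0, not 8: the neighbouring local is
# overwritten, and vbuf itself holds only part of the data.
assert stack[0:16] == b'\xAA' * 16
assert stack[16:24] == b'\x00' * 8
```

This is exactly the double failure the patch description names: other locals are overwritten, and vbuf does not contain the data the emulation goes on to hand back to userspace.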


Re: [PATCH 1/2] powerpc: Fix user data corruption with P9N DD2.1 VSX CI load workaround emulation

2020-10-13 Thread Michael Ellerman
Michael Neuling  writes:
> __get_user_atomic_128_aligned() stores to kaddr using stvx which is a
> VMX store instruction, hence kaddr must be 16 byte aligned otherwise
> the store won't occur as expected.
>
> Unfortunately when we call __get_user_atomic_128_aligned() in
> p9_hmi_special_emu(), the buffer we pass as kaddr (ie. vbuf) isn't
> guaranteed to be 16B aligned. This means that the write to vbuf in
> __get_user_atomic_128_aligned() has the bottom bits of the address
> truncated. This results in other local variables being
> overwritten. Also vbuf will not contain the correct data which results
> in the userspace emulation being wrong and hence user data corruption.
>
> In the past we've been mostly lucky as vbuf has ended up aligned but
> this is fragile and isn't always true. CONFIG_STACKPROTECTOR in
> particular can change the stack arrangement enough that our luck runs
> out.

Below is a script which takes a System.map and vmlinux (or objdump
output) and tries to check if the stack layout is susceptible to the
bug.

cheers



#!/usr/bin/python3

import os
import sys
import re
from subprocess import Popen, PIPE


# eg: c002ea88:   ce 49 00 7c     stvx    v0,0,r9
stvx_pattern = re.compile(r'^c[0-9a-f]{15}:\s+(?:[0-9a-f]{2} ){4}\s+stvx\s+v0,0,(r\d+)\s*')

# eg: c002ea80:   28 00 21 39     addi    r9,r1,40
addi_pattern = r'^c[0-9a-f]{15}:\s+(?:[0-9a-f]{2} ){4}\s+addi\s+%s,r1,(\d+)\s*'


def main(args):
    if len(args) != 2:
        print('Usage: %s <vmlinux/objdump output> <System.map>' % sys.argv[0])
        return -1

    if os.path.basename(sys.argv[1]).startswith('vmlinu'):
        dump = Popen(['objdump', '-d', sys.argv[1]], stdout=PIPE,
                     encoding='utf-8').stdout
    else:
        dump = open(sys.argv[1])

    syms = read_symbols(sys.argv[2])

    func_lines = extract_func(dump, 'handle_hmi_exception', syms)
    if func_lines is None:
        print("Error: couldn't find handle_hmi_exception in objdump output")
        return -1

    match = None
    i = 0
    while i < len(func_lines):
        match = stvx_pattern.match(func_lines[i])
        if match:
            break
        i += 1

    if match is None:
        print("Error: couldn't find stvx in handle_hmi_exception")
        return -1

    stvx_reg = match.group(1)
    print('stvx found using register %s:\n%s\n' % (stvx_reg,
                                                   match.group(0).rstrip()))

    # Walk backwards looking for the addi that computed the address.
    match = None
    i -= 1
    while i > 0:
        pattern = re.compile(addi_pattern % stvx_reg)
        match = pattern.match(func_lines[i])
        if match:
            break
        i -= 1

    if match is None:
        print("Error: couldn't find addi in handle_hmi_exception")
        return -1

    stack_offset = int(match.group(1))
    print('addi found using offset %d:\n%s\n' % (stack_offset,
                                                 match.group(0).rstrip()))

    if stack_offset & 0xf:
        print('')
        print('!! Offset is misaligned - bug present !!')
        print('')
        return 1
    else:
        print('OK - offset is aligned')

    return 0


def extract_func(f, func_name, syms):
    func_addr, func_size = find_symbol_and_size(syms, func_name)
    num_lines = int(func_size / 4)

    pattern = re.compile('^%016x:' % func_addr)

    match = None
    line = f.readline()
    while len(line):
        match = pattern.match(line)
        if match:
            break
        line = f.readline()

    if match is None:
        return None

    lines = []
    for i in range(0, num_lines):
        lines.append(f.readline())

    return lines


def read_symbols(map_path):
    lines = open(map_path).readlines()

    addrs = []
    last_addr = 0
    for line in lines:
        tokens = line.split()
        if len(tokens) == 3:
            addr = int(tokens[0], 16)
            sym_type = tokens[1]
            name = tokens[2]
        elif len(tokens) == 2:
            addr = last_addr
            sym_type = tokens[0]
            name = tokens[1]
        else:
            raise Exception("Couldn't grok System.map")

        addrs.append((addr, name, sym_type))
        last_addr = addr

    return addrs


def find_symbol_and_size(symbol_map, name):
    dot_name = '.%s' % name
    saddr = None
    i = 0
    for addr, cur_name, sym_type in symbol_map:
        if cur_name == name or cur_name == dot_name:
            saddr = addr
            break
        i += 1

    if saddr is None:
        return (None, None)

    # Size is the distance to the next symbol in the map.
    i += 1
    if i >= len(symbol_map):
        size = -1
    else:
        size = symbol_map[i][0] - saddr

    return (saddr, size)


sys.exit(main(sys.argv[1:]))
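
The verdict at the end of the script comes down to a one-line check on the frame offset fed to addi. With the offset from the example disassembly in the script's comments (addi r9,r1,40), the check fires, since 40 = 0x28 has nonzero low bits:

```python
def offset_is_buggy(stack_offset):
    # vbuf is misaligned (bug present) iff the frame offset is not a
    # multiple of 16, because r1 (the stack pointer) is 16B-aligned.
    return bool(stack_offset & 0xF)

assert offset_is_buggy(40)        # 0x28 -> low bits 0x8, misaligned
assert not offset_is_buggy(48)    # 0x30 -> 16B-aligned, safe
```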


Re: [PATCH 1/2] powerpc: Fix user data corruption with P9N DD2.1 VSX CI load workaround emulation

2020-10-13 Thread Michael Ellerman
Michael Neuling  writes:
> __get_user_atomic_128_aligned() stores to kaddr using stvx which is a
> VMX store instruction, hence kaddr must be 16 byte aligned otherwise
> the store won't occur as expected.
>
> Unfortunately when we call __get_user_atomic_128_aligned() in
> p9_hmi_special_emu(), the buffer we pass as kaddr (ie. vbuf) isn't
> guaranteed to be 16B aligned. This means that the write to vbuf in
> __get_user_atomic_128_aligned() has the bottom bits of the address
> truncated. This results in other local variables being
> overwritten. Also vbuf will not contain the correct data which results
> in the userspace emulation being wrong and hence user data corruption.
>
> In the past we've been mostly lucky as vbuf has ended up aligned but
> this is fragile and isn't always true. CONFIG_STACKPROTECTOR in
> particular can change the stack arrangement enough that our luck runs
> out.

Actually I'm yet to find a kernel with CONFIG_STACKPROTECTOR=n that is
vulnerable to the bug.

Turning on STACKPROTECTOR changes the order GCC allocates locals on the
stack, from bottom-up to top-down. That in conjunction with the 8 byte
stack canary means we end up with 8 bytes of space below the locals,
which misaligns vbuf.

But obviously other things can change the stack layout too, so no
guarantees that CONFIG_STACKPROTECTOR=n makes it safe.
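
To illustrate how an 8-byte canary can flip the alignment (the offsets here are hypothetical; GCC is free to lay out the frame differently):

```python
# Hypothetical frame offsets from the 16B-aligned stack pointer.
# Without the canary, vbuf happened to land 16B-aligned; with
# STACKPROTECTOR the 8-byte canary shifts the locals by 8 bytes.
vbuf_off_no_ssp = 32                 # 32 & 0xF == 0 -> aligned, lucky
canary_size = 8
vbuf_off_ssp = vbuf_off_no_ssp + canary_size   # 40 -> misaligned

assert vbuf_off_no_ssp & 0xF == 0
assert vbuf_off_ssp & 0xF == 8
```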

cheers


[PATCH 1/2] powerpc: Fix user data corruption with P9N DD2.1 VSX CI load workaround emulation

2020-10-12 Thread Michael Neuling
__get_user_atomic_128_aligned() stores to kaddr using stvx which is a
VMX store instruction, hence kaddr must be 16 byte aligned otherwise
the store won't occur as expected.

Unfortunately when we call __get_user_atomic_128_aligned() in
p9_hmi_special_emu(), the buffer we pass as kaddr (ie. vbuf) isn't
guaranteed to be 16B aligned. This means that the write to vbuf in
__get_user_atomic_128_aligned() has the bottom bits of the address
truncated. This results in other local variables being
overwritten. Also vbuf will not contain the correct data which results
in the userspace emulation being wrong and hence user data corruption.

In the past we've been mostly lucky as vbuf has ended up aligned but
this is fragile and isn't always true. CONFIG_STACKPROTECTOR in
particular can change the stack arrangement enough that our luck runs
out.

This issue only occurs on POWER9 Nimbus <= DD2.1 bare metal.

The fix is to align vbuf to a 16 byte boundary.

Fixes: 5080332c2c89 ("powerpc/64s: Add workaround for P9 vector CI load issue")
Signed-off-by: Michael Neuling 
Cc:  # v4.15+
---
 arch/powerpc/kernel/traps.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index c5f39f13e96e..5006dcbe1d9f 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -885,7 +885,7 @@ static void p9_hmi_special_emu(struct pt_regs *regs)
 {
 	unsigned int ra, rb, t, i, sel, instr, rc;
 	const void __user *addr;
-	u8 vbuf[16], *vdst;
+	u8 vbuf[16] __aligned(16), *vdst;
 	unsigned long ea, msr, msr_mask;
 	bool swap;
 
-- 
2.26.2