Re: Regression in kernel 4.12-rc1 for Powerpc 32 - bisected to commit 3448890c32c3

2017-06-26 Thread Michael Ellerman
Larry Finger  writes:

> On 06/23/2017 03:29 PM, Al Viro wrote:
>> On Fri, Jun 23, 2017 at 01:49:16PM -0500, Larry Finger wrote:
>> 
>>>> BTW, could you try to check what happens if you kill the
>>>> 	if (__builtin_constant_p(n) && (n <= 8))
>>>> bits in raw_copy_{to,from}_user()?  The usefulness of those (in
>>>> __copy_from_user() originally) had always been dubious and the things are
>>>> simpler without them.
>>>> If _that_ turns out to cure breakage, I would be very surprised, though.

>>> Sorry I was gone so long. Installing jessie on this box resulted in a crash
>>> on boot. Lubuntu 14.04 yielded a desktop with a functioning cursor, but
>>> nothing else. Finally, Ubuntu 12.04 resulted in a working system. I hate
>>> Unity, but I guess I'm stuck for now.
>> 
>> Ho-hum...  Jessie is 3.16, so whatever is crashing there, it's something
>> different...  Ubuntu 12.04 is what, 3.2?
>> 
>>> I know how easy it is to screw up a long bisection by booting the wrong
>>> kernel. To help that problem and to work around the yaconf/yboot nonsense on
>>> the MAC, my /etc/yaconf has always had generic kernel stanzas with only
>>> default, old, and original kernels mentioned. From there I use a local
>>> script to finish a kernel installation by moving the default links to the
>>> old ones and creating the new default links pointing to the current kernel.
>>> With those long-tested scripts, I'm sure that I am booting the one I want.
>>>
>>> With the new installation, kernel 4.12-rc6 failed, as did 3448890c with the
>>> backported 46f401c4 added.
>>>
>>> Replacing "if (__builtin_constant_p(n) && (n <= 8))" with "if (0)" had no 
>>> effect.
>> 
>> OK, that simplifies things a bit.  Just to make sure we are on the same page:
>> 
>> * f2ed8bebee69 + cherry-pick of 46f401c4 boots (Ubuntu 12.04 userland)
>> * 3448890c32c3 + cherry-pick of 46f401c4 fails (Ubuntu 12.04 userland), ditto
>>   with removal of constant-size bits in raw_copy_..._user().  Failure appears
>>   to be on udev getting EFAULT on some syscalls.
>> * straight Ubuntu 12.04 works
>> * jessie crashes on boot.
>
> I made a break through. If I turn off inline copy to/from users for 32-bit
> ppc with the following patch, then the system boots:
>
> diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
> index 5c0d8a8cdae5..1e6a8723f497 100644
> --- a/arch/powerpc/include/asm/uaccess.h
> +++ b/arch/powerpc/include/asm/uaccess.h
> @@ -267,12 +267,7 @@ do { \
>   extern unsigned long __copy_tofrom_user(void __user *to,
>  const void __user *from, unsigned long size);
>
> -#ifndef __powerpc64__
> -
> -#define INLINE_COPY_FROM_USER
> -#define INLINE_COPY_TO_USER
> -
> -#else /* __powerpc64__ */
> +#ifdef __powerpc64__
>
>   static inline unsigned long
>   raw_copy_in_user(void __user *to, const void __user *from, unsigned long n)

Thanks for debugging this.

I just sent a fix based on the above. Let me know if it doesn't work for
you.

cheers


Re: gcc 4.6.3 miscompile on ppc32 (was Re: Regression in kernel 4.12-rc1 for Powerpc 32 - bisected to commit 3448890c32c3)

2017-06-26 Thread Michael Ellerman
Al Viro  writes:

> On Sun, Jun 25, 2017 at 04:44:09PM -0500, Segher Boessenkool wrote:
>
>> Do you have a short stand-alone testcase?  4.6 is ancient, of course, but
>> the actual problem may still exist in more recent compilers (if it _is_
>> a compiler problem; if it's not, you *really* want to know :-) )
>
> Enjoy.  At least 6.3 doesn't step into that.  Look for mtctr in the resulting
> asm...
>
> cat <<'EOF' >a.c
...

I pointed creduce at that and got the version below, which I'm pretty
sure still exhibits the weird mtctr behaviour.

cheers

# cat input.c
struct {
  void *iov_base;
  unsigned iov_len;
} * c;
long v;
void *a;
int b;
unsigned bar();
foo(unsigned p1) {
  unsigned d, e = p1;
  if (p1 == 0)
    goto out;
  if (p1 > 4)
    goto out;
  if (__builtin_expect(!!(0, v && a), 1))
    e = bar();
  if (e)
    barf(e);
  if (e)
    goto out;
  d = 0;
  for (; d < p1; d++) {
    int f = c[d].iov_len;
    if (__builtin_expect(c[d].iov_base && f, 0))
      b = 4;
  }
out:;
}

$ cat output.s 
.file   "input.c"

 # rs6000/powerpc options: -mcpu=powerpc -msdata=data -G 8
 # GNU C (GCC) version 4.6.3 (powerpc64-linux)
 #  compiled by GNU C version 4.3.2, GMP version 4.3.2, MPFR version 2.4.2, MPC version 0.8.2
 # ...

 # Compiler executable checksum: 4b51a6b899110d06c9e3310ac66ad26c

.section ".text"
.align 2
.globl foo
.type   foo, @function
foo:
cmpwi 0,3,0  # tmp169, p1
stwu 1,-16(1)#,,
mflr 0   #,
stw 0,20(1)  #,
beq- 0,.L9   #
cmplwi 7,3,4 #, tmp170, p1
bgt- 7,.L9   #
lis 9,v@ha   # tmp172,
lwz 0,v@l(9) # v, v
cmpwi 7,0,0  #, tmp174, v
beq- 7,.L3   #
lis 9,a@ha   # tmp176,
lwz 0,a@l(9) # a, a
cmpwi 7,0,0  #, tmp178, a
beq- 7,.L3   #
bl bar   #
cmpwi 0,3,0  # tmp179, e
beq+ 0,.L4   #
.L3:
bl barf  #
b .L9#
.L4:
lis 8,0x2000 #,
lis 9,c@ha   # tmp181,
mtctr 8  # tmp192,
lwz 11,c@l(9)# c, c.3
lis 10,b@ha  # tmp190,
li 9,0   # ivtmp.12,
li 0,4   # tmp191,
.L6:
lwzx 7,11,9  # MEM[base: c.3_14, index: ivtmp.12_25, offset: 0B], MEM[base: c.3_14, index: ivtmp.12_25, offset: 0B]
add 8,11,9   # tmp182, c.3, ivtmp.12
lwz 8,4(8)   # MEM[base: D.1310_21, offset: 4B], D.1287
cmpwi 7,7,0  #, tmp184, MEM[base: c.3_14, index: ivtmp.12_25, offset: 0B]
beq+ 7,.L5   #
cmpwi 7,8,0  #, tmp185, D.1287
beq+ 7,.L5   #
stw 0,b@l(10)# b, tmp191
.L5:
addi 9,9,8   # ivtmp.12, ivtmp.12,
bdnz .L6 #
.L2:
.L9:
lwz 0,20(1)  #,
addi 1,1,16  #,,
mtlr 0   #,
blr  #
.size   foo,.-foo
.globl b
.globl a
.globl v
.globl c
.section .sbss,"aw",@nobits
.align 2
.type   b, @object
.size   b, 4
b:
.zero   4
.type   a, @object
.size   a, 4
a:
.zero   4
.type   v, @object
.size   v, 4
v:
.zero   4
.type   c, @object
.size   c, 4
c:
.zero   4
.ident  "GCC: (GNU) 4.6.3"
.section .note.GNU-stack,"",@progbits
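
For readers less familiar with the PowerPC count register, here is a rough C model of the `mtctr` / `bdnz` loop shape seen at `.L4`/`.L6` above. It is only a sketch: the function name `bdnz_iterations` and the macro `CTR_FROM_LISTING` are made up for illustration, and the model assumes the usual `bdnz` behaviour (decrement CTR, then branch while CTR is nonzero).

```c
#include <stdint.h>

/* Value that "lis 8,0x2000 ; mtctr 8" puts into CTR in the listing:
 * lis loads the immediate shifted left by 16 bits. */
#define CTR_FROM_LISTING (0x2000u << 16)

/* Number of times the body of a loop of the form
 *     mtctr rN ; .L6: <body> ; bdnz .L6
 * runs for a given initial CTR.  bdnz decrements CTR first and
 * branches back only while the result is nonzero, so a starting
 * value of N gives N passes (a starting value of 0 would wrap and
 * give 2^32 passes, which is not exercised here). */
uint64_t bdnz_iterations(uint32_t ctr)
{
	uint64_t iterations = 0;

	do {
		iterations++;   /* loop body runs once per pass */
		ctr--;          /* bdnz: decrement CTR ...      */
	} while (ctr != 0);     /* ... and loop while CTR != 0  */

	return iterations;
}
```

In the reduced source the loop can run at most 4 times (p1 is checked against 4 before the loop), whereas `CTR_FROM_LISTING` is 0x20000000, so the generated loop bound has no relation to p1.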


Re: gcc 4.6.3 miscompile on ppc32 (was Re: Regression in kernel 4.12-rc1 for Powerpc 32 - bisected to commit 3448890c32c3)

2017-06-25 Thread Al Viro
On Sun, Jun 25, 2017 at 04:44:09PM -0500, Segher Boessenkool wrote:

> Do you have a short stand-alone testcase?  4.6 is ancient, of course, but
> the actual problem may still exist in more recent compilers (if it _is_
> a compiler problem; if it's not, you *really* want to know :-) )

Enjoy.  At least 6.3 doesn't step into that.  Look for mtctr in the resulting
asm...

cat <<'EOF' >a.c
struct iovec
{
 void *iov_base;
 unsigned iov_len;
};

unsigned long v;

extern void * barf(void *,int,unsigned);

extern unsigned long bar(void *to, const void *from, unsigned long size);

static inline unsigned long __bar(void *to, const void *from, unsigned long n)
{
 unsigned long res = n;
 if (__builtin_expect(!!(((void)0, ((((unsigned long)(from)) <= v) &&
     (((n) == 0) || (((n) - 1) <= (v - ((unsigned long)(from))))))))), 1))
  res = bar(to, from, n);
 if (res)
  barf(to + (n - res), 0, res);
 return res;
}

int foo(int type, const struct iovec * uvector,
 unsigned long nr_segs, unsigned long fast_segs,
 struct iovec *iov,
 struct iovec **ret_pointer)
{
 unsigned long seg;
 int ret;
 if (nr_segs == 0) {
  ret = 0;
  goto out;
 }
 if (nr_segs > 1024) {
  ret = -22;
  goto out;
 }
 if (__bar(iov, uvector, nr_segs*sizeof(*uvector))) {
  ret = -14;
  goto out;
 }
 ret = 0;
 for (seg = 0; seg < nr_segs; seg++) {
  void *buf = iov[seg].iov_base;
  int len = (int)iov[seg].iov_len;
  if (len < 0) {
   ret = -22;
   goto out;
  }
  if (type >= 0
      && __builtin_expect(!!(!((void)0, ((((unsigned long)(buf)) <= v) &&
         (((len) == 0) || (((len) - 1) <= (v - ((unsigned long)(buf))))))))), 0)) {
   ret = -14;
   goto out;
  }
  ret += len;
 }
out:
 *ret_pointer = iov;
 return ret;
}
EOF
powerpc-linux-gcc -m32 -fno-strict-aliasing -fno-common -std=gnu89 -fno-PIE \
  -msoft-float -pipe -ffixed-r2 -mmultiple -mno-altivec -mno-vsx -mno-spe \
  -mspe=no -funit-at-a-time -fno-dwarf2-cfi-asm -mno-string -mcpu=powerpc \
  -Wa,-maltivec -mbig-endian -fno-delete-null-pointer-checks -Os \
  -fno-stack-protector -Wno-unused-but-set-variable -fomit-frame-pointer \
  -fno-var-tracking-assignments -femit-struct-debug-baseonly -fno-var-tracking \
  -fno-strict-overflow -fconserve-stack -fverbose-asm -S a.c
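
The long conditions in the testcase are a preprocessed address-range check against the global `v`. As a reading aid, the same test can be written out as a small userspace function. This is a sketch only: `range_ok` and its parameter names are invented here, and 32-bit unsigned arithmetic is assumed to match ppc32.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the block starting at `addr` and `size` bytes
 * long ends at or below `limit` (the role played by `v` in the
 * testcase).  A zero-length block only needs addr <= limit. */
bool range_ok(uint32_t addr, uint32_t size, uint32_t limit)
{
	if (addr > limit)
		return false;

	/* Comparing size - 1 with limit - addr avoids computing
	 * addr + size, which could wrap around in 32 bits and make
	 * an out-of-range block look valid. */
	return size == 0 || size - 1 <= limit - addr;
}
```

The subtraction form is why the expression looks odd at first sight: a naive `addr + size <= limit` would accept a block such as addr = 0xfffffff0, size = 0x20, because the sum wraps to 0x10.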


Re: gcc 4.6.3 miscompile on ppc32 (was Re: Regression in kernel 4.12-rc1 for Powerpc 32 - bisected to commit 3448890c32c3)

2017-06-25 Thread Segher Boessenkool
On Sun, Jun 25, 2017 at 09:53:24PM +0100, Al Viro wrote:
> Confirmed.  It manages to bugger the loop immediately after the (successful)
> copying of iovec array in rw_copy_check_uvector(); both with and without
> INLINE_COPY_FROM_USER it has (just before the call of copy_from_user()) r27
> set to nr_segs * sizeof(struct iovec).  The call is made, we check that it
> has succeeded and that's when it hits the fan: without INLINE_COPY_FROM_USER
> we have (interleaved with unrelated insns)
> addi 27,27,-8
> srwi 27,27,3
> addi 27,27,1
> mtctr 27
> Weird, but manages to pass nr_segs to mtctr.

This weirdosity is https://gcc.gnu.org/PR67288 .  Those three instructions
are not the same as just  srwi 27,27,3  in case r27 is 0; GCC does not
figure out this cannot happen here.

> _With_ INLINE_COPY_FROM_USER we
> get this:
> lis 9,0x2000
> mtctr 9
> In other words, the loop will try to go through 8192 iterations.  No idea 
> where
> that number has come from, but it sure as hell is wrong.

8192*65536, even.  This is as if r27 was 0 always.
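
The arithmetic behind both remarks can be checked with a few lines of C. This is a sketch, not compiler output: the function names are invented, 32-bit wrapping arithmetic is assumed as on ppc32, and `nbytes` stands for the r27 value nr_segs * sizeof(struct iovec), with 8 bytes per iovec.

```c
#include <stdint.h>

/* The sequence emitted without INLINE_COPY_FROM_USER:
 *     addi 27,27,-8 ; srwi 27,27,3 ; addi 27,27,1 */
uint32_t ctr_three_insn(uint32_t nbytes)
{
	uint32_t r = nbytes - 8u;   /* addi: wraps when nbytes < 8 */

	r >>= 3;                    /* srwi: logical shift right   */
	return r + 1u;              /* addi                        */
}

/* What a single "srwi 27,27,3" would compute. */
uint32_t ctr_plain_shift(uint32_t nbytes)
{
	return nbytes >> 3;
}

/* Register value produced by "lis rD,imm": imm << 16. */
uint32_t lis_value(uint16_t imm)
{
	return (uint32_t)imm << 16;
}
```

For any nonzero multiple of 8 the two count functions agree, so the longer sequence does pass nr_segs to mtctr. They differ only at 0, where the three-instruction form wraps to 0x20000000. That is exactly the constant `lis 9,0x2000` loads, which is what "as if r27 was 0 always" refers to.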

Do you have a short stand-alone testcase?  4.6 is ancient, of course, but
the actual problem may still exist in more recent compilers (if it _is_
a compiler problem; if it's not, you *really* want to know :-) )


Segher


gcc 4.6.3 miscompile on ppc32 (was Re: Regression in kernel 4.12-rc1 for Powerpc 32 - bisected to commit 3448890c32c3)

2017-06-25 Thread Al Viro
On Sun, Jun 25, 2017 at 12:14:04PM +0100, Al Viro wrote:
> On Sun, Jun 25, 2017 at 10:53:58AM +0100, Al Viro wrote:
> > On Sat, Jun 24, 2017 at 12:29:23PM -0500, Larry Finger wrote:
> > 
> > > I made a break through. If I turn off inline copy to/from users for 32-bit
> > > ppc with the following patch, then the system boots:
> > 
> > OK...  So it's 4.6.3 miscompiling something - it is hardware-independent,
> > reproduced in qemu.  I'd like to get more self-contained example of
> > miscompile, though; should be done by tonight...
> 
> OK, it's the call in rw_copy_check_uvector(); with INLINE_COPY_FROM_USER
> it's miscompiled by 4.6.3.  I hadn't looked through the generated code
> yet; will do that after I grab some sleep.

Confirmed.  It manages to bugger the loop immediately after the (successful)
copying of iovec array in rw_copy_check_uvector(); both with and without
INLINE_COPY_FROM_USER it has (just before the call of copy_from_user()) r27
set to nr_segs * sizeof(struct iovec).  The call is made, we check that it
has succeeded and that's when it hits the fan: without INLINE_COPY_FROM_USER
we have (interleaved with unrelated insns)
addi 27,27,-8
srwi 27,27,3
addi 27,27,1
mtctr 27
Weird, but manages to pass nr_segs to mtctr.  _With_ INLINE_COPY_FROM_USER we
get this:
lis 9,0x2000
mtctr 9
In other words, the loop will try to go through 8192 iterations.  No idea where
that number has come from, but it sure as hell is wrong.  That's where those
-EINVAL, etc. are coming from - we run into something negative in iov[seg].len,
after having run out of on-stack iovec array.

Assembler generated out of rw_copy_check_uvector() with and without
INLINE_COPY_FROM_USER is attached; it's a definite miscompile.  Neither 4.4.5
nor 6.3.0 use mtctr/bdnz for that loop.

The bottom line is, ppc cross-toolchain on kernel.org happens to be
the version that miscompiles rw_copy_check_uvector() with INLINE_COPY_FROM_USER
and hell knows what else.  Said that, I would rather have ppc32 drop the
INLINE_COPY_{TO,FROM}_USER anyway; that won't fix any other places where
the same 4.6.3 bug hits, but I seriously suspect that it will end up being
faster even on non^Wless buggy gcc versions.  Could powerpc folks check
what does removing those two defines from arch/powerpc/include/asm/uaccess.h
do to performance?  If there's no slowdown, I would strongly recommend just
removing those as in the patch Larry has posted upthread.

Fixing whatever it is in gcc 4.6.3 that triggers that behaviour is
IMO pointless - it might make sense to switch kernel.org cross-toolchain to
something more recent, but that's it.

.globl rw_copy_check_uvector
.type   rw_copy_check_uvector, @function
rw_copy_check_uvector:
.LFB2683:
.loc 1 773 0
stwu 1,-32(1)#,,
.LCFI142:
mflr 0   #,
.LCFI143:
stmw 27,12(1)#,
.LCFI144:
.loc 1 783 0
mr. 27,5 # nr_segs, nr_segs
.loc 1 773 0
mr 30,3  # type, type
stw 0,36(1)  #,
.LCFI145:
.loc 1 773 0
mr 31,4  # uvector, uvector
mr 29,8  # ret_pointer, ret_pointer
.loc 1 776 0
mr 28,7  # iov, fast_pointer
.loc 1 784 0
li 0,0   # ret,
.loc 1 783 0
beq- 0,.L495 #
.loc 1 792 0
cmplwi 7,27,1024 #, tmp160, nr_segs
.loc 1 793 0
li 0,-22 # ret,
.loc 1 792 0
bgt- 7,.L495 #
.loc 1 796 0
cmplw 7,27,6 # fast_segs, tmp161, nr_segs
ble- 7,.L496 #
.LBB1538:
.LBB1539:
.file 21 "./include/linux/slab.h"
.loc 21 495 0
lis 4,0x140  # tmp190,
slwi 3,27,3  #, nr_segs,
ori 4,4,192  #,, tmp190,
bl __kmalloc #
.LBE1539:
.LBE1538:
.loc 1 799 0
li 0,-12 # ret,
.loc 1 798 0
mr. 28,3 # iov,
beq- 0,.L495 #
.L496:
.LBB1540:
.LBB1541:
.LBB1542:
.LBB1543:
.loc 19 113 0
lwz 0,1128(2)# current.192_185->thread.fs.seg, D.39493
.LBE1543:
.LBE1542:
.LBE1541:
.LBE1540:
.loc 1 803 0
slwi 27,27,3 # n, nr_segs,
.LBB1549:
.LBB1548:
.LBB1547:
.LBB1546:
mr 5,27  # n, n
.loc 19 113 0
cmplw 7,31,0 # D.39493, tmp165, uvector
bgt- 7,.L497 #
addi 9,27,-1 # tmp166, n,
subf 0,31,0  # tmp167, uvector, D.39493
cmplw 7,9,0  # tmp167, tmp168, tmp166
bgt- 7,.L497 #
.LBB1544:
.LBB1545:
.file 22 "./arch/powerpc/include/asm/uaccess.h"
.loc 22 305 0
mr 3,28  #, iov
mr 4,31  #, uvector
bl __copy_tofrom_user#
.LBE1545:
.LBE1544:
.loc 19 115 0
mr. 5,3  # n,
beq+ 0,.L498 #
.L497:
.loc 19 116 0
subf 3,5,27  # tmp170, n, n
li 4,0   #,
add 3,28,3   #, iov, tmp170
bl memset#

Re: Regression in kernel 4.12-rc1 for Powerpc 32 - bisected to commit 3448890c32c3

2017-06-25 Thread Al Viro
On Sun, Jun 25, 2017 at 10:53:58AM +0100, Al Viro wrote:
> On Sat, Jun 24, 2017 at 12:29:23PM -0500, Larry Finger wrote:
> 
> > I made a break through. If I turn off inline copy to/from users for 32-bit
> > ppc with the following patch, then the system boots:
> 
> OK...  So it's 4.6.3 miscompiling something - it is hardware-independent,
> reproduced in qemu.  I'd like to get more self-contained example of
> miscompile, though; should be done by tonight...

OK, it's the call in rw_copy_check_uvector(); with INLINE_COPY_FROM_USER
it's miscompiled by 4.6.3.  I hadn't looked through the generated code
yet; will do that after I grab some sleep.


Re: Regression in kernel 4.12-rc1 for Powerpc 32 - bisected to commit 3448890c32c3

2017-06-25 Thread Al Viro
On Sat, Jun 24, 2017 at 12:29:23PM -0500, Larry Finger wrote:

> I made a break through. If I turn off inline copy to/from users for 32-bit
> ppc with the following patch, then the system boots:

OK...  So it's 4.6.3 miscompiling something - it is hardware-independent,
reproduced in qemu.  I'd like to get more self-contained example of
miscompile, though; should be done by tonight...


Re: Regression in kernel 4.12-rc1 for Powerpc 32 - bisected to commit 3448890c32c3

2017-06-24 Thread Larry Finger

On 06/23/2017 03:29 PM, Al Viro wrote:
> On Fri, Jun 23, 2017 at 01:49:16PM -0500, Larry Finger wrote:
>
>>> BTW, could you try to check what happens if you kill the
>>> 	if (__builtin_constant_p(n) && (n <= 8))
>>> bits in raw_copy_{to,from}_user()?  The usefulness of those (in
>>> __copy_from_user() originally) had always been dubious and the things are
>>> simpler without them.
>>> If _that_ turns out to cure breakage, I would be very surprised, though.
>>
>> Sorry I was gone so long. Installing jessie on this box resulted in a crash
>> on boot. Lubuntu 14.04 yielded a desktop with a functioning cursor, but
>> nothing else. Finally, Ubuntu 12.04 resulted in a working system. I hate
>> Unity, but I guess I'm stuck for now.
>
> Ho-hum...  Jessie is 3.16, so whatever is crashing there, it's something
> different...  Ubuntu 12.04 is what, 3.2?
>
>> I know how easy it is to screw up a long bisection by booting the wrong
>> kernel. To help that problem and to work around the yaconf/yboot nonsense on
>> the MAC, my /etc/yaconf has always had generic kernel stanzas with only
>> default, old, and original kernels mentioned. From there I use a local
>> script to finish a kernel installation by moving the default links to the
>> old ones and creating the new default links pointing to the current kernel.
>> With those long-tested scripts, I'm sure that I am booting the one I want.
>>
>> With the new installation, kernel 4.12-rc6 failed, as did 3448890c with the
>> backported 46f401c4 added.
>>
>> Replacing "if (__builtin_constant_p(n) && (n <= 8))" with "if (0)" had no
>> effect.
>
> OK, that simplifies things a bit.  Just to make sure we are on the same page:
>
> * f2ed8bebee69 + cherry-pick of 46f401c4 boots (Ubuntu 12.04 userland)
> * 3448890c32c3 + cherry-pick of 46f401c4 fails (Ubuntu 12.04 userland), ditto
>   with removal of constant-size bits in raw_copy_..._user().  Failure appears
>   to be on udev getting EFAULT on some syscalls.
> * straight Ubuntu 12.04 works
> * jessie crashes on boot.


I made a breakthrough. If I turn off inline copy to/from user for 32-bit ppc
with the following patch, then the system boots:


diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 5c0d8a8cdae5..1e6a8723f497 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -267,12 +267,7 @@ do { \
 extern unsigned long __copy_tofrom_user(void __user *to,
 		const void __user *from, unsigned long size);

-#ifndef __powerpc64__
-
-#define INLINE_COPY_FROM_USER
-#define INLINE_COPY_TO_USER
-
-#else /* __powerpc64__ */
+#ifdef __powerpc64__

 static inline unsigned long
 raw_copy_in_user(void __user *to, const void __user *from, unsigned long n)

It seems whatever problem I am seeing is in the inline version of 
_copy_to_user() and _copy_from_user() on the 32-bit ppc. The only other 
difference between the two versions is the placement of the __user macro, which 
looks to be wrong in the non-inlined version of _copy_to_user() in 
lib/usercopy.c, but that is the one that works.


To me, this looks like a compiler error. On the PowerBook, 'gcc --version' 
reports "gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3".


I will prepare a proper patch that I will send to you privately. If you agree
with it, it can be sent through normal channels in time for the release of 4.12.


Larry
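
For readers unfamiliar with the mechanism the patch above toggles, here is a
minimal userspace sketch, not the kernel code. It assumes the selection works
roughly as described in the message: when an architecture defines the
INLINE_COPY_* macros, callers get a static inline helper; otherwise a single
out-of-line function is used. The names model_copy_body(),
model_copy_from_user() and MODEL_INLINE_COPY_FROM_USER are hypothetical, and
the body is a plain memcpy() with no access checks.

```c
#include <assert.h>
#include <string.h>

/* Shared body. In the kernel this is where the user-access checks and the
 * raw copy would happen; here it is a plain memcpy() (assumption). */
static inline unsigned long model_copy_body(void *to, const void *from,
					    unsigned long n)
{
	memcpy(to, from, n);
	return 0;	/* number of bytes NOT copied */
}

#ifdef MODEL_INLINE_COPY_FROM_USER
/* Arch opted in: the compiler may expand the body at every call site. */
static inline unsigned long model_copy_from_user(void *to, const void *from,
						 unsigned long n)
{
	return model_copy_body(to, from, n);
}
#else
/* Arch did not opt in: one out-of-line copy, never expanded into callers.
 * noinline is a GCC/Clang attribute. */
__attribute__((noinline))
static unsigned long model_copy_from_user(void *to, const void *from,
					  unsigned long n)
{
	return model_copy_body(to, from, n);
}
#endif
```

Building once with -DMODEL_INLINE_COPY_FROM_USER and once without gives the
same results from both variants; only the generated code at the call sites
differs, which is the kind of difference a compiler bug could be sensitive to.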



Re: Regression in kernel 4.12-rc1 for Powerpc 32 - bisected to commit 3448890c32c3

2017-06-23 Thread Al Viro
On Fri, Jun 23, 2017 at 01:49:16PM -0500, Larry Finger wrote:

> > BTW, could you try to check what happens if you kill the
> > 	if (__builtin_constant_p(n) && (n <= 8))
> > bits in raw_copy_{to,from}_user()?  The usefulness of those (in
> > __copy_from_user() originally) had always been dubious and the things are
> > simpler without them.
> > If _that_ turns out to cure breakage, I would be very surprised, though.
> > 
> Sorry I was gone so long. Installing jessie on this box resulted in a crash
> on boot. Lubuntu 14.04 yielded a desktop with a functioning cursor, but
> nothing else. Finally, Ubuntu 12.04 resulted in a working system. I hate
> Unity, but I guess I'm stuck for now.

Ho-hum...  Jessie is 3.16, so whatever is crashing there, it's something
different...  Ubuntu 12.04 is what, 3.2?

> I know how easy it is to screw up a long bisection by booting the wrong
> kernel. To help that problem and to work around the yaconf/yboot nonsense on
> the MAC, my /etc/yaconf has always had generic kernel stanzas with only
> default, old, and original kernels mentioned. From there I use a local
> script to finish a kernel installation by moving the default links to the
> old ones and creating the new default links pointing to the current kernel.
> With those long-tested scripts, I'm sure that I am booting the one I want.
> 
> With the new installation, kernel 4.12-rc6 failed, as did 3448890c with the
> backported 46f401c4 added.
> 
> Replacing "if (__builtin_constant_p(n) && (n <= 8))" with "if (0)" had no 
> effect.

OK, that simplifies things a bit.  Just to make sure we are on the same page:

* f2ed8bebee69 + cherry-pick of 46f401c4 boots (Ubuntu 12.04 userland)
* 3448890c32c3 + cherry-pick of 46f401c4 fails (Ubuntu 12.04 userland), ditto
  with removal of constant-size bits in raw_copy_..._user().  Failure appears
  to be on udev getting EFAULT on some syscalls.
* straight Ubuntu 12.04 works
* jessie crashes on boot.

Could you post the boot logs of the first two?


Re: Regression in kernel 4.12-rc1 for Powerpc 32 - bisected to commit 3448890c32c3

2017-06-23 Thread Larry Finger

On 06/22/2017 02:25 PM, Al Viro wrote:
> On Thu, Jun 22, 2017 at 09:19:58AM -0500, Larry Finger wrote:
>
>>> Ugh...  MintPPC appears to be dead.  On KVM with Debian userland (either
>>> jessie or wheezy - no difference in result) booting the commit in
>>> question with your .config oopses as soon as pata_macio is initialized,
>>> due to the bug in "treewide: Move dma_ops from struct dev_archdata into
>>> struct device", and after cherry-picking your own fix for that (commit
>>> 46f401c4297a "powerpc/pmac: Fix crash in dma-mapping.h with NULL dma_ops")
>>> the result boots just fine.
>>>
>>> Again, that happens both for Debian 8 and Debian 7 userlands, so unless
>>> Mint had been doing something very odd there, I would question the accuracy
>>> of your bisect...
>>
>> Any chance that real hardware differs from KVM emulation?
>
> For that one?  Bloody unlikely; udev could, theoretically, hit different
> codepaths due to different devices being observed, etc., but changes in that
> commit are not in the areas that would be easy to get wrong in emulator.
>
>> All I know at this
>> point is that commit f2ed8beb with 46f401c4 backported boots OK and commit
>> 3448890c with the same backport fails.
>>
>> I will try loading jessie and see what happens.
>
> I would recheck which kernels are being booted - I had screwed that up during
> long bisects often enough...
>
> BTW, could you try to check what happens if you kill the
> 	if (__builtin_constant_p(n) && (n <= 8))
> bits in raw_copy_{to,from}_user()?  The usefulness of those (in
> __copy_from_user() originally) had always been dubious and the things are
> simpler without them.
> If _that_ turns out to cure breakage, I would be very surprised, though.

Sorry I was gone so long. Installing jessie on this box resulted in a crash on 
boot. Lubuntu 14.04 yielded a desktop with a functioning cursor, but nothing 
else. Finally, Ubuntu 12.04 resulted in a working system. I hate Unity, but I 
guess I'm stuck for now.


I know how easy it is to screw up a long bisection by booting the wrong kernel. 
To help that problem and to work around the yaconf/yboot nonsense on the MAC, my 
/etc/yaconf has always had generic kernel stanzas with only default, old, and 
original kernels mentioned. From there I use a local script to finish a kernel 
installation by moving the default links to the old ones and creating the new 
default links pointing to the current kernel. With those long-tested scripts, 
I'm sure that I am booting the one I want.


With the new installation, kernel 4.12-rc6 failed, as did 3448890c with the 
backported 46f401c4 added.


Replacing "if (__builtin_constant_p(n) && (n <= 8))" with "if (0)" had no 
effect.

Larry



Re: Regression in kernel 4.12-rc1 for Powerpc 32 - bisected to commit 3448890c32c3

2017-06-22 Thread Al Viro
On Thu, Jun 22, 2017 at 08:25:16PM +0100, Al Viro wrote:
> > All I know at this
> > point is that commit f2ed8beb with 46f401c4 backported boots OK and commit
> > 3448890c with the same backport fails.
> > 
> > I will try loading jessie and see what happens.
> 
> I would recheck which kernels are being booted - I had screwed that up during
> long bisects often enough...
> 
> BTW, could you try to check what happens if you kill the
> 	if (__builtin_constant_p(n) && (n <= 8))
> bits in raw_copy_{to,from}_user()?  The usefulness of those (in
> __copy_from_user() originally) had always been dubious and the things are
> simpler without them.
> If _that_ turns out to cure breakage, I would be very surprised, though.

FWIW, having dug through the __copy_tofrom_user() change in 3448890c, I don't
see anything that would be likely to cause that effect, be it on hardware or
emulated.
Moreover, had that been fucked up, I would've expected lots and lots of folks
screaming by now - boot being broken since -rc1 tends to have such effect, even
if nobody had noticed that in -next last cycle.

What I can prove is that
* __copy_tofrom_user() return value is unchanged in all cases
* the only difference in its behaviour is that prior to that commit
some cases when it returns non-zero used to do memset(dest + something, 0,
retval) and now they do not.  _All_ such cases must have stepped into a fault
on load from src + something.

And looking through arch/powerpc callers of all that bunch, I don't see any
candidates for being buggered by disappearing memset() on partial copy with
faulting read; note that copy_from_user() *will* memset() explicitly if
raw_copy_from_user() returns non-zero.  I wondered if it could be a weird
case when copy_to_user() had been running into an unmapped area of *source*
and proceeded to zero the tail of destination, but I don't see anything
likely in arch/powerpc and anything in arch-independent code would've been
oopsing on that all along for some architectures...
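
The memset() argument above can be illustrated with a minimal userspace
sketch. This is a model under stated assumptions, not the kernel
implementation: model_raw_copy() and model_copy_from_user() are hypothetical
names, and a "fault" is simulated by a fault_at parameter that limits how many
source bytes can be read. The raw helper returns the number of bytes not
copied and leaves the destination tail alone (the behaviour described for the
code after the commit); the wrapper zeroes that tail itself, as
copy_from_user() is said to do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for raw_copy_from_user(): copies up to n bytes but
 * "faults" after fault_at bytes have been read. Returns the number of bytes
 * NOT copied and does not touch the rest of the destination. */
static size_t model_raw_copy(void *dst, const void *src, size_t n,
			     size_t fault_at)
{
	size_t ok = n < fault_at ? n : fault_at;

	memcpy(dst, src, ok);
	return n - ok;
}

/* Hypothetical stand-in for copy_from_user(): on a short copy, zero the
 * uncopied tail so the caller never sees stale destination bytes. */
static size_t model_copy_from_user(void *dst, const void *src, size_t n,
				   size_t fault_at)
{
	size_t left = model_raw_copy(dst, src, n, fault_at);

	if (left)
		memset((char *)dst + (n - left), 0, left);
	return left;
}
```

With an 8-byte request and a simulated fault after 3 bytes, both helpers
return 5; only the wrapper leaves the last 5 destination bytes zeroed. The
return value is the same either way, which is the point made in the first
bullet above.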


Re: Regression in kernel 4.12-rc1 for Powerpc 32 - bisected to commit 3448890c32c3

2017-06-22 Thread Al Viro
On Thu, Jun 22, 2017 at 09:19:58AM -0500, Larry Finger wrote:

> > Ugh...  MintPPC appears to be dead.  On KVM with Debian userland (either
> > jessie or wheezy - no difference in result) booting the commit in
> > question with your .config oopses as soon as pata_macio is initialized,
> > due to the bug in "treewide: Move dma_ops from struct dev_archdata into
> > struct device", and after cherry-picking your own fix for that (commit
> > 46f401c4297a "powerpc/pmac: Fix crash in dma-mapping.h with NULL dma_ops")
> > the result boots just fine.
> > 
> > Again, that happens both for Debian 8 and Debian 7 userlands, so unless
> > Mint had been doing something very odd there, I would question the accuracy
> > of your bisect...
 
> Any chance that real hardware differs from KVM emulation?

For that one?  Bloody unlikely; udev could, theoretically, hit different
codepaths due to different devices being observed, etc., but changes in that
commit are not in the areas that would be easy to get wrong in emulator.

> All I know at this
> point is that commit f2ed8beb with 46f401c4 backported boots OK and commit
> 3448890c with the same backport fails.
> 
> I will try loading jessie and see what happens.

I would recheck which kernels are being booted - I had screwed that up during
long bisects often enough...

BTW, could you try to check what happens if you kill the
	if (__builtin_constant_p(n) && (n <= 8))
bits in raw_copy_{to,from}_user()?  The usefulness of those (in
__copy_from_user() originally) had always been dubious and the things are
simpler without them.
If _that_ turns out to cure breakage, I would be very surprised, though.
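
For context, the "constant-size bits" follow a common pattern: when the length
is a compile-time constant of at most 8 bytes, a fixed-size access is used
instead of calling the general copy routine. The following is a minimal
userspace sketch of that pattern only. model_raw_copy() and
model_copy_generic() are hypothetical names, the fixed-size memcpy() calls
stand in for the kernel's single-access helpers, and __builtin_constant_p()
is a GCC/Clang builtin whose result for an inline-function argument depends
on optimization level.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical out-of-line path, standing in for __copy_tofrom_user(). */
static unsigned long model_copy_generic(void *to, const void *from,
					unsigned long n)
{
	memcpy(to, from, n);
	return 0;	/* number of bytes NOT copied */
}

static inline unsigned long model_raw_copy(void *to, const void *from,
					   unsigned long n)
{
	/* Fast path: only taken when the compiler can prove n is constant. */
	if (__builtin_constant_p(n) && (n <= 8)) {
		switch (n) {
		case 1: memcpy(to, from, 1); return 0;
		case 2: memcpy(to, from, 2); return 0;
		case 4: memcpy(to, from, 4); return 0;
		case 8: memcpy(to, from, 8); return 0;
		}
		/* Other small sizes fall through to the generic path. */
	}
	return model_copy_generic(to, from, n);
}
```

Replacing the condition with "if (0)", as suggested in the thread, forces
every call down the generic path; the copied data is the same in both cases.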


Re: Regression in kernel 4.12-rc1 for Powerpc 32 - bisected to commit 3448890c32c3

2017-06-22 Thread Larry Finger

On 06/22/2017 09:12 AM, Al Viro wrote:
> On Wed, Jun 21, 2017 at 04:49:46PM -0500, Larry Finger wrote:
>> On 06/21/2017 04:34 PM, Al Viro wrote:
>>> On Wed, Jun 21, 2017 at 04:31:40PM -0500, Larry Finger wrote:
>>>> On 06/21/2017 04:22 PM, Al Viro wrote:
>>>>> How about the .config that works on parent of that commit?
>>>>
>>>> Attached.
>>>
>>> OK... am I right assuming straight jessie/powerpc userland?
>>
>> Actually Mint 12 with nothing fancy. The machine mainly exists to test the
>> wireless drivers work on big-endian hardware.
>>
>> 'cat /etc/issue' reports "Debian GNU/Linux 7".
>
> Ugh...  MintPPC appears to be dead.  On KVM with Debian userland (either
> jessie or wheezy - no difference in result) booting the commit in
> question with your .config oopses as soon as pata_macio is initialized,
> due to the bug in "treewide: Move dma_ops from struct dev_archdata into
> struct device", and after cherry-picking your own fix for that (commit
> 46f401c4297a "powerpc/pmac: Fix crash in dma-mapping.h with NULL dma_ops")
> the result boots just fine.
>
> Again, that happens both for Debian 8 and Debian 7 userlands, so unless
> Mint had been doing something very odd there, I would question the accuracy
> of your bisect...


Any chance that real hardware differs from KVM emulation? All I know at this 
point is that commit f2ed8beb with 46f401c4 backported boots OK and commit 
3448890c with the same backport fails.


I will try loading jessie and see what happens.

Thanks for investigating.

Larry




Re: Regression in kernel 4.12-rc1 for Powerpc 32 - bisected to commit 3448890c32c3

2017-06-22 Thread Al Viro
On Wed, Jun 21, 2017 at 04:49:46PM -0500, Larry Finger wrote:
> On 06/21/2017 04:34 PM, Al Viro wrote:
> > On Wed, Jun 21, 2017 at 04:31:40PM -0500, Larry Finger wrote:
> > > On 06/21/2017 04:22 PM, Al Viro wrote:
> > > > How about the .config that works on parent of that commit?
> > > 
> > > Attached.
> > 
> > OK... am I right assuming straight jessie/powerpc userland?
> > 
> 
> Actually Mint 12 with nothing fancy. The machine mainly exists to test the
> wireless drivers work on big-endian hardware.
> 
> 'cat /etc/issue' reports "Debian GNU/Linux 7".

Ugh...  MintPPC appears to be dead.  On KVM with Debian userland (either
jessie or wheezy - no difference in result) booting the commit in
question with your .config oopses as soon as pata_macio is initialized,
due to the bug in "treewide: Move dma_ops from struct dev_archdata into
struct device", and after cherry-picking your own fix for that (commit
46f401c4297a "powerpc/pmac: Fix crash in dma-mapping.h with NULL dma_ops")
the result boots just fine.

Again, that happens both for Debian 8 and Debian 7 userlands, so unless
Mint had been doing something very odd there, I would question the accuracy
of your bisect...


Re: Regression in kernel 4.12-rc1 for Powerpc 32 - bisected to commit 3448890c32c3

2017-06-21 Thread Larry Finger

On 06/21/2017 04:34 PM, Al Viro wrote:
> On Wed, Jun 21, 2017 at 04:31:40PM -0500, Larry Finger wrote:
>> On 06/21/2017 04:22 PM, Al Viro wrote:
>>> How about the .config that works on parent of that commit?
>>
>> Attached.
>
> OK... am I right assuming straight jessie/powerpc userland?

Actually Mint 12 with nothing fancy. The machine mainly exists to test the
wireless drivers work on big-endian hardware.

'cat /etc/issue' reports "Debian GNU/Linux 7".

Larry






Re: Regression in kernel 4.12-rc1 for Powerpc 32 - bisected to commit 3448890c32c3

2017-06-21 Thread Al Viro
On Wed, Jun 21, 2017 at 04:31:40PM -0500, Larry Finger wrote:
> On 06/21/2017 04:22 PM, Al Viro wrote:
> > On Wed, Jun 21, 2017 at 10:10:57AM -0500, Larry Finger wrote:
> > 
> > > I finally finished the bisection by patching each commit that was affected
> > > by the bootstrap crash. The faulty change is commit
> > > 3448890c32c32c482c3ec20baa8fdd2ab4f94cc0 ("powerpc: get rid of zeroing,
> > > switch to RAW_COPY_USER"). I am very confident in the bisection.
> > > 
> > > As I know nothing of assembly for ppc32, I have not been able to attempt
> > > to find the problem with these patches.
> > 
> > How about the .config that works on parent of that commit?
> 
> Attached.

OK... am I right assuming straight jessie/powerpc userland?


Re: Regression in kernel 4.12-rc1 for Powerpc 32 - bisected to commit 3448890c32c3

2017-06-21 Thread Larry Finger

On 06/21/2017 04:22 PM, Al Viro wrote:
> On Wed, Jun 21, 2017 at 10:10:57AM -0500, Larry Finger wrote:
>
>> I finally finished the bisection by patching each commit that was affected
>> by the bootstrap crash. The faulty change is commit
>> 3448890c32c32c482c3ec20baa8fdd2ab4f94cc0 ("powerpc: get rid of zeroing,
>> switch to RAW_COPY_USER"). I am very confident in the bisection.
>>
>> As I know nothing of assembly for ppc32, I have not been able to attempt to
>> find the problem with these patches.
>
> How about the .config that works on parent of that commit?

Attached.

Larry

#
# Automatically generated file; DO NOT EDIT.
# Linux/powerpc 4.11.0-rc1 Kernel Configuration
#
# CONFIG_PPC64 is not set

#
# Processor support
#
CONFIG_PPC_BOOK3S_32=y
# CONFIG_PPC_85xx is not set
# CONFIG_PPC_8xx is not set
# CONFIG_40x is not set
# CONFIG_44x is not set
# CONFIG_E200 is not set
CONFIG_PPC_BOOK3S=y
CONFIG_6xx=y
CONFIG_PPC_FPU=y
CONFIG_ALTIVEC=y
CONFIG_PPC_STD_MMU=y
CONFIG_PPC_STD_MMU_32=y
# CONFIG_PPC_MM_SLICES is not set
CONFIG_PPC_HAVE_PMU_SUPPORT=y
CONFIG_PPC_PERF_CTRS=y
# CONFIG_SMP is not set
# CONFIG_PPC_DOORBELL is not set
CONFIG_VDSO32=y
CONFIG_CPU_BIG_ENDIAN=y
CONFIG_PPC32=y
CONFIG_32BIT=y
# CONFIG_ARCH_PHYS_ADDR_T_64BIT is not set
# CONFIG_ARCH_DMA_ADDR_T_64BIT is not set
CONFIG_MMU=y
# CONFIG_HAVE_SETUP_PER_CPU_AREA is not set
# CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK is not set
CONFIG_NR_IRQS=512
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_ARCH_HAS_ILOG2_U32=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_HAS_DMA_SET_COHERENT_MASK=y
CONFIG_PPC=y
# CONFIG_GENERIC_CSUM is not set
CONFIG_EARLY_PRINTK=y
CONFIG_PANIC_TIMEOUT=180
CONFIG_GENERIC_NVRAM=y
CONFIG_SCHED_OMIT_FRAME_POINTER=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_PPC_UDBG_16550=y
# CONFIG_GENERIC_TBSYNC is not set
CONFIG_AUDIT_ARCH=y
CONFIG_GENERIC_BUG=y
# CONFIG_EPAPR_BOOT is not set
# CONFIG_DEFAULT_UIMAGE is not set
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
# CONFIG_PPC_DCR_NATIVE is not set
# CONFIG_PPC_DCR_MMIO is not set
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_PPC_EMULATE_SSTEP=y
CONFIG_PGTABLE_LEVELS=2
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y

#
# General setup
#
CONFIG_BROKEN_ON_SMP=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_KERNEL_GZIP=y
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_CROSS_MEMORY_ATTACH=y
CONFIG_FHANDLE=y
CONFIG_USELIB=y
CONFIG_AUDIT=y
CONFIG_HAVE_ARCH_AUDITSYSCALL=y
CONFIG_AUDITSYSCALL=y
CONFIG_AUDIT_WATCH=y
CONFIG_AUDIT_TREE=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_IRQ_SHOW_LEVEL=y
CONFIG_IRQ_DOMAIN=y
CONFIG_GENERIC_MSI_IRQ=y
# CONFIG_IRQ_DOMAIN_DEBUG is not set
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
CONFIG_GENERIC_TIME_VSYSCALL_OLD=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CMOS_UPDATE=y

#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y

#
# CPU/Task time and stats accounting
#
CONFIG_TICK_CPU_ACCOUNTING=y
# CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is not set
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y

#
# RCU Subsystem
#
CONFIG_TINY_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_SRCU=y
# CONFIG_TASKS_RCU is not set
# CONFIG_RCU_STALL_COMMON is not set
# CONFIG_TREE_RCU_TRACE is not set
CONFIG_BUILD_BIN2C=y
# CONFIG_IKCONFIG is not set
CONFIG_LOG_BUF_SHIFT=20
CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=13
CONFIG_CGROUPS=y
# CONFIG_MEMCG is not set
CONFIG_BLK_CGROUP=y
# CONFIG_DEBUG_BLK_CGROUP is not set
CONFIG_CGROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
# CONFIG_CFS_BANDWIDTH is not set
# CONFIG_RT_GROUP_SCHED is not set
# CONFIG_CGROUP_PIDS is not set
# CONFIG_CGROUP_RDMA is not set
CONFIG_CGROUP_FREEZER=y
CONFIG_CPUSETS=y
CONFIG_PROC_PID_CPUSET=y
CONFIG_CGROUP_DEVICE=y
CONFIG_CGROUP_CPUACCT=y
CONFIG_CGROUP_PERF=y
# CONFIG_CGROUP_DEBUG is not set
# CONFIG_SOCK_CGROUP_DATA is not set
# CONFIG_CHECKPOINT_RESTORE is not set
CONFIG_NAMESPACES=y
CONFIG_UTS_NS=y
CONFIG_IPC_NS=y
CONFIG_USER_NS=y
CONFIG_PID_NS=y
CONFIG_NET_NS=y
CONFIG_SCHED_AUTOGROUP=y
# CONFIG_SYSFS_DEPRECATED is not set
CONFIG_RELAY=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_RD_GZIP=y
CONFIG_RD_BZIP2=y
CONFIG_RD_LZMA=y
CONFIG_RD_XZ=y
CONFIG_RD_LZO=y
CONFIG_RD_LZ4=y
CONFIG_INITRAMFS_COMPRESSION=".gz"
# CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE is not set
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
CONFIG_ANON_INODES=y
CONFIG_SYSCTL_EXCEPTION_TRACE=y
CONFIG_HAVE_PCSPKR_PLATFORM=y
CONFIG_BPF=y
# CONFIG_EXPERT is not 

Re: Regression in kernel 4.12-rc1 for Powerpc 32 - bisected to commit 3448890c32c3

2017-06-21 Thread Al Viro
On Wed, Jun 21, 2017 at 10:10:57AM -0500, Larry Finger wrote:

> I finally finished the bisection by patching each commit that was affected
> by the bootstrap crash. The faulty change is commit
> 3448890c32c32c482c3ec20baa8fdd2ab4f94cc0 ("powerpc: get rid of zeroing,
> switch to RAW_COPY_USER"). I am very confident in the bisection.
> 
> As I know nothing of assembly for ppc32, I have not been able to attempt to
> find the problem with these patches.

How about the .config that works on parent of that commit?


Re: Regression in kernel 4.12-rc1 for Powerpc 32 - bisected to commit 3448890c32c3

2017-06-21 Thread Larry Finger

On 06/01/2017 11:39 AM, Larry Finger wrote:
> On my Powerbook G4 aluminum, kernel 4.12-rc1 fails to boot. I cannot save a
> copy of the early printk messages and I will need to summarize.
> 
> The kernel finds the hard drive and recognizes the various partitions. All
> seems normal until after the "unused kernel memory is freed" and "This
> architecture does not have kernel memory protection" messages, which are
> both normal. The next batch of logged messages is:
> 
> Loading, please wait...
> udevd[64]: starting version 175
> udevd[64]: Unable to receive ctrl message: Bad address.
> modprobe: chdir(4.12-rc1): No such file or directory
> udevd[64]: Unable to receive ctrl message: Bad address.
> Begin: Loading essential drivers ... modprobe: chdir: chdir(4.12.0-rc1): No such file or directory
> done.
> Begin: Running /scripts/init-premount ... done.
> udevd[64]: Begin: Waiting for root file system ... [   11.651175] random: fast init done
> done.
> Gave up waiting for root device. Common problems:
>   - Boot args (cat /proc/cmdline)
>     - Check rootdelay= (did the system wait long enough?)
>     - Check root= (Did the system wait for the right device?)
>   - Missing modules (cat /proc/modules; ls /dev)
> ALERT! /dev/disk/by-uuid/ does not exist. Dropping to a shell!
> modprobe: chdir(4.12-rc1): No such file or directory
> 
> BusyBox v1.20.2 (Debian 1:1.20.0-7) built-in shell (ash)
> Enter 'help' for a list of built-in commands.
> 
> /bin/sh: can't access tty: Job control turned off
> (initramfs)
> 
> At that point, the system is dead. I have tried bisecting this issue;
> however, I run into a second problem that crashes the bootstrap.

I finally finished the bisection by patching each commit that was affected by
the bootstrap crash. The faulty change is commit
3448890c32c32c482c3ec20baa8fdd2ab4f94cc0 ("powerpc: get rid of zeroing, switch
to RAW_COPY_USER"). I am very confident in the bisection.

As I know nothing of assembly for ppc32, I have not been able to attempt to find
the problem with these patches.

Larry
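
For readers who have not bisected around a second, unrelated breakage before: the "patching each commit" approach described above can be sketched roughly as follows. This is only an illustration on a throwaway toy repository, not the procedure actually used on the Powerbook. The files `boot` and `copy`, the commit messages, and the helper names are all made up for the example; in a real kernel tree the test step would be "build, install, try to boot".

```shell
#!/bin/sh
# Toy demo: bisect a regression while an unrelated, later-fixed crash would
# otherwise make every intermediate commit untestable.
set -e

repo=$(mktemp -d)
step=$(mktemp)
cd "$repo"
git init -q .
git config user.email "demo@example.invalid"
git config user.name "Bisect Demo"

# commit <boot-state> <copy-state> <message>
# "boot" and "copy" are hypothetical stand-ins for "does it get past the
# bootstrap" and "does the code under suspicion work".
commit() {
    echo "$1" > boot
    echo "$2" > copy
    git add boot copy
    git commit -q --allow-empty -m "$3"
}

commit ok    ok     "c1: known good";        good=$(git rev-parse HEAD)
commit crash ok     "c2: unrelated bootstrap crash introduced"
commit crash ok     "c3: unrelated work"
commit crash broken "c4: the regression";    regression=$(git rev-parse HEAD)
commit crash broken "c5: unrelated work"
commit ok    broken "c6: bootstrap crash fixed"; fix=$(git rev-parse HEAD)

# Per-step test script for "git bisect run": temporarily apply the crash fix,
# test, then drop the fix again so bisect can move to the next commit.
cat > "$step" <<'EOF'
git cherry-pick --no-commit "$FIX" >/dev/null 2>&1 || {
    git reset -q --hard
    exit 125                      # 125 tells bisect to skip this commit
}
if grep -qx ok boot && grep -qx ok copy; then rc=0; else rc=1; fi
git reset -q --hard               # remove the temporary fix
exit "$rc"
EOF

export FIX="$fix"
git bisect start "$fix" "$good" >/dev/null
git bisect run sh "$step" >/dev/null
found=$(git rev-parse refs/bisect/bad)   # first bad commit once bisect is done
git bisect reset >/dev/null 2>&1

echo "first bad commit: $(git log -1 --format=%s "$found")"
```

Without the temporary cherry-pick, every commit between c2 and c5 would fail for the wrong reason and the bisection would converge on the unrelated crash instead of the regression.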



