Re: in_compat_syscall() returns from kernel thread for X86_32.

2019-01-11 Thread Pavel Machek
On Wed 2018-10-24 10:32:37, Theodore Y. Ts'o wrote:
> On Wed, Oct 24, 2018 at 09:15:34AM -0400, Theodore Y. Ts'o wrote:
> > At least for ext4, the primary problem is that we want to use a 64-bit
> > telldir/seekdir cookie if all 64-bits will make it to user space, and
> > a 32-bit telldir cookie if only 32 bits will make it userspace.  This
> > impacts NFS as well because if there are people who are still using
> > NFSv2, which has 32-bit directory offsets, we need to constrain the
> > telldir/seekdir cookies we give to NFS to be a 32 has as opposed to a
> > 64-bit hash.
> 
> Are there anyone still using NFSv2, BTW?  One way of making the
> problem *much* easier to sovle would be to drop NFSv2 support.  :-)

NFSv2 is now in "unexcpected" places such as U-Boot bootloader. I'm
pretty sure someone still uses it...

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


signature.asc
Description: Digital signature


Re: in_compat_syscall() returns from kernel thread for X86_32.

2018-10-24 Thread Andy Lutomirski



> On Oct 24, 2018, at 8:46 PM, NeilBrown  wrote:
> 
>> On Wed, Oct 24 2018, Theodore Y. Ts'o wrote:
>> 
>>> On Wed, Oct 24, 2018 at 12:47:57PM +1100, NeilBrown wrote:
>>> 
>>> I doubt it was copied - more likely independent evolution.
>>> But on reflection, I see that it is probably reasonable that it
>>> shouldn't be used this way - or at all in this context.
>>> I'll try to understand what the issues really are and see if I can
>>> find a solution that doesn't depend on this interface.
>>> Thanks for your help.
>> 
>> At least for ext4, the primary problem is that we want to use a 64-bit
>> telldir/seekdir cookie if all 64-bits will make it to user space, and
>> a 32-bit telldir cookie if only 32 bits will make it userspace.  This
>> impacts NFS as well because if there are people who are still using
>> NFSv2, which has 32-bit directory offsets, we need to constrain the
>> telldir/seekdir cookies we give to NFS to be a 32 has as opposed to a
>> 64-bit hash.
> 
> NFSd uses FMODE_32BITHASH or FMODE64BITHASH to allow ext4 to do the
> right thing.  FMODE_32BITHASH is set for NFSv2 only.
> 
> Maybe sys_getdents needs to set FMODE_32BITHASH, and sys_getdent64 needs
> to set FMODE_64BITHASH - or something like that.

It’s possible for a 32-bit process and a 64-bit process to share a directory 
fd, so I don’t think it’s quite that simple.

One option would be to add .llseek and .getdents flags or entire new compat 
operations to indicate that the caller expects 32-bit offsets.

I wonder how overlayfs interacts with this whole mess.

Re: in_compat_syscall() returns from kernel thread for X86_32.

2018-10-24 Thread Andy Lutomirski



> On Oct 24, 2018, at 8:46 PM, NeilBrown  wrote:
> 
>> On Wed, Oct 24 2018, Theodore Y. Ts'o wrote:
>> 
>>> On Wed, Oct 24, 2018 at 12:47:57PM +1100, NeilBrown wrote:
>>> 
>>> I doubt it was copied - more likely independent evolution.
>>> But on reflection, I see that it is probably reasonable that it
>>> shouldn't be used this way - or at all in this context.
>>> I'll try to understand what the issues really are and see if I can
>>> find a solution that doesn't depend on this interface.
>>> Thanks for your help.
>> 
>> At least for ext4, the primary problem is that we want to use a 64-bit
>> telldir/seekdir cookie if all 64-bits will make it to user space, and
>> a 32-bit telldir cookie if only 32 bits will make it userspace.  This
>> impacts NFS as well because if there are people who are still using
>> NFSv2, which has 32-bit directory offsets, we need to constrain the
>> telldir/seekdir cookies we give to NFS to be a 32 has as opposed to a
>> 64-bit hash.
> 
> NFSd uses FMODE_32BITHASH or FMODE64BITHASH to allow ext4 to do the
> right thing.  FMODE_32BITHASH is set for NFSv2 only.
> 
> Maybe sys_getdents needs to set FMODE_32BITHASH, and sys_getdent64 needs
> to set FMODE_64BITHASH - or something like that.

It’s possible for a 32-bit process and a 64-bit process to share a directory 
fd, so I don’t think it’s quite that simple.

One option would be to add .llseek and .getdents flags or entire new compat 
operations to indicate that the caller expects 32-bit offsets.

I wonder how overlayfs interacts with this whole mess.

Re: in_compat_syscall() returns from kernel thread for X86_32.

2018-10-24 Thread NeilBrown
On Wed, Oct 24 2018, Theodore Y. Ts'o wrote:

> On Wed, Oct 24, 2018 at 12:47:57PM +1100, NeilBrown wrote:
>> 
>> I doubt it was copied - more likely independent evolution.
>> But on reflection, I see that it is probably reasonable that it
>> shouldn't be used this way - or at all in this context.
>> I'll try to understand what the issues really are and see if I can
>> find a solution that doesn't depend on this interface.
>> Thanks for your help.
>
> At least for ext4, the primary problem is that we want to use a 64-bit
> telldir/seekdir cookie if all 64-bits will make it to user space, and
> a 32-bit telldir cookie if only 32 bits will make it userspace.  This
> impacts NFS as well because if there are people who are still using
> NFSv2, which has 32-bit directory offsets, we need to constrain the
> telldir/seekdir cookies we give to NFS to be a 32 has as opposed to a
> 64-bit hash.

NFSd uses FMODE_32BITHASH or FMODE64BITHASH to allow ext4 to do the
right thing.  FMODE_32BITHASH is set for NFSv2 only.

Maybe sys_getdents needs to set FMODE_32BITHASH, and sys_getdent64 needs
to set FMODE_64BITHASH - or something like that.

For lustre it is a bit more complex.  The internal "inode number" is 128
bits and we (sort of) hash it to 32 or 64 bits.  cp_compat_stat() just
says -EOVERFLOW if we give a 64 bit number when 32 are expected, and
there is no flag to say "this is a 32-bit 'stat' request".

But I need to dig into exactly what that "sort-of" means - maybe there
is an answer in there.

NeilBrown


signature.asc
Description: PGP signature


Re: in_compat_syscall() returns from kernel thread for X86_32.

2018-10-24 Thread NeilBrown
On Wed, Oct 24 2018, Theodore Y. Ts'o wrote:

> On Wed, Oct 24, 2018 at 12:47:57PM +1100, NeilBrown wrote:
>> 
>> I doubt it was copied - more likely independent evolution.
>> But on reflection, I see that it is probably reasonable that it
>> shouldn't be used this way - or at all in this context.
>> I'll try to understand what the issues really are and see if I can
>> find a solution that doesn't depend on this interface.
>> Thanks for your help.
>
> At least for ext4, the primary problem is that we want to use a 64-bit
> telldir/seekdir cookie if all 64-bits will make it to user space, and
> a 32-bit telldir cookie if only 32 bits will make it userspace.  This
> impacts NFS as well because if there are people who are still using
> NFSv2, which has 32-bit directory offsets, we need to constrain the
> telldir/seekdir cookies we give to NFS to be a 32 has as opposed to a
> 64-bit hash.

NFSd uses FMODE_32BITHASH or FMODE64BITHASH to allow ext4 to do the
right thing.  FMODE_32BITHASH is set for NFSv2 only.

Maybe sys_getdents needs to set FMODE_32BITHASH, and sys_getdent64 needs
to set FMODE_64BITHASH - or something like that.

For lustre it is a bit more complex.  The internal "inode number" is 128
bits and we (sort of) hash it to 32 or 64 bits.  cp_compat_stat() just
says -EOVERFLOW if we give a 64 bit number when 32 are expected, and
there is no flag to say "this is a 32-bit 'stat' request".

But I need to dig into exactly what that "sort-of" means - maybe there
is an answer in there.

NeilBrown


signature.asc
Description: PGP signature


Re: in_compat_syscall() returns from kernel thread for X86_32.

2018-10-24 Thread Theodore Y. Ts'o
On Wed, Oct 24, 2018 at 09:15:34AM -0400, Theodore Y. Ts'o wrote:
> At least for ext4, the primary problem is that we want to use a 64-bit
> telldir/seekdir cookie if all 64-bits will make it to user space, and
> a 32-bit telldir cookie if only 32 bits will make it userspace.  This
> impacts NFS as well because if there are people who are still using
> NFSv2, which has 32-bit directory offsets, we need to constrain the
> telldir/seekdir cookies we give to NFS to be a 32 has as opposed to a
> 64-bit hash.

Are there anyone still using NFSv2, BTW?  One way of making the
problem *much* easier to sovle would be to drop NFSv2 support.  :-)

 - Ted


Re: in_compat_syscall() returns from kernel thread for X86_32.

2018-10-24 Thread Theodore Y. Ts'o
On Wed, Oct 24, 2018 at 09:15:34AM -0400, Theodore Y. Ts'o wrote:
> At least for ext4, the primary problem is that we want to use a 64-bit
> telldir/seekdir cookie if all 64-bits will make it to user space, and
> a 32-bit telldir cookie if only 32 bits will make it userspace.  This
> impacts NFS as well because if there are people who are still using
> NFSv2, which has 32-bit directory offsets, we need to constrain the
> telldir/seekdir cookies we give to NFS to be a 32 has as opposed to a
> 64-bit hash.

Are there anyone still using NFSv2, BTW?  One way of making the
problem *much* easier to sovle would be to drop NFSv2 support.  :-)

 - Ted


Re: in_compat_syscall() returns from kernel thread for X86_32.

2018-10-24 Thread Theodore Y. Ts'o
On Wed, Oct 24, 2018 at 12:47:57PM +1100, NeilBrown wrote:
> 
> I doubt it was copied - more likely independent evolution.
> But on reflection, I see that it is probably reasonable that it
> shouldn't be used this way - or at all in this context.
> I'll try to understand what the issues really are and see if I can
> find a solution that doesn't depend on this interface.
> Thanks for your help.

At least for ext4, the primary problem is that we want to use a 64-bit
telldir/seekdir cookie if all 64-bits will make it to user space, and
a 32-bit telldir cookie if only 32 bits will make it userspace.  This
impacts NFS as well because if there are people who are still using
NFSv2, which has 32-bit directory offsets, we need to constrain the
telldir/seekdir cookies we give to NFS to be a 32 has as opposed to a
64-bit hash.

- Ted


Re: in_compat_syscall() returns from kernel thread for X86_32.

2018-10-24 Thread Theodore Y. Ts'o
On Wed, Oct 24, 2018 at 12:47:57PM +1100, NeilBrown wrote:
> 
> I doubt it was copied - more likely independent evolution.
> But on reflection, I see that it is probably reasonable that it
> shouldn't be used this way - or at all in this context.
> I'll try to understand what the issues really are and see if I can
> find a solution that doesn't depend on this interface.
> Thanks for your help.

At least for ext4, the primary problem is that we want to use a 64-bit
telldir/seekdir cookie if all 64-bits will make it to user space, and
a 32-bit telldir cookie if only 32 bits will make it userspace.  This
impacts NFS as well because if there are people who are still using
NFSv2, which has 32-bit directory offsets, we need to constrain the
telldir/seekdir cookies we give to NFS to be a 32 has as opposed to a
64-bit hash.

- Ted


Re: in_compat_syscall() returns from kernel thread for X86_32.

2018-10-23 Thread NeilBrown
On Thu, Oct 18 2018, Andy Lutomirski wrote:

> On Wed, Oct 17, 2018 at 9:36 PM NeilBrown  wrote:
>>
>> On Wed, Oct 17 2018, Andy Lutomirski wrote:
>>
>> > On Wed, Oct 17, 2018 at 6:48 PM NeilBrown  wrote:
>> >>
>> >>
>> >> Was: Re: [tip:x86/asm] x86/entry: Rename is_{ia32,x32}_task() to 
>> >> in_{ia32,x32}_syscall()
>> >> On Tue, Apr 19 2016, tip-bot for Dmitry Safonov wrote:
>> >>
>> >> > Commit-ID:  abfb9498ee1327f534df92a7ecaea81a85913bae
>> >> > Gitweb: 
>> >> > http://git.kernel.org/tip/abfb9498ee1327f534df92a7ecaea81a85913bae
>> >> > Author: Dmitry Safonov 
>> >> > AuthorDate: Mon, 18 Apr 2016 16:43:43 +0300
>> >> > Committer:  Ingo Molnar 
>> >> > CommitDate: Tue, 19 Apr 2016 10:44:52 +0200
>> >> >
>> >> > x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall()
>> >> >
>> >> ...
>> >> > @@ -318,7 +318,7 @@ static inline bool is_x32_task(void)
>> >> >
>> >> >  static inline bool in_compat_syscall(void)
>> >> >  {
>> >> > - return is_ia32_task() || is_x32_task();
>> >> > + return in_ia32_syscall() || in_x32_syscall();
>> >> >  }
>> >>
>> >> Hi,
>> >>  I'm reply to this patch largely to make sure I get the right people
>> >>  .
>> >>
>> >> This test is always true when CONFIG_X86_32 is set, as that forces
>> >> in_ia32_syscall() to true.
>> >> However we might not be in a syscall at all - we might be running a
>> >> kernel thread which is always in 64 mode.
>> >> Every other implementation of in_compat_syscall() that I found is
>> >> dependant on a thread flag or syscall register flag, and so returns
>> >> "false" in a kernel thread.
>> >>
>> >> Might something like this be appropriate?
>> >>
>> >> diff --git a/arch/x86/include/asm/thread_info.h 
>> >> b/arch/x86/include/asm/thread_info.h
>> >> index 2ff2a30a264f..c265b40a78f2 100644
>> >> --- a/arch/x86/include/asm/thread_info.h
>> >> +++ b/arch/x86/include/asm/thread_info.h
>> >> @@ -219,7 +219,7 @@ static inline int arch_within_stack_frames(const void 
>> >> * const stack,
>> >>  #ifndef __ASSEMBLY__
>> >>
>> >>  #ifdef CONFIG_X86_32
>> >> -#define in_ia32_syscall() true
>> >> +#define in_ia32_syscall() (!(current->flags & PF_KTHREAD))
>> >>  #else
>> >>  #define in_ia32_syscall() (IS_ENABLED(CONFIG_IA32_EMULATION) && \
>> >>current_thread_info()->status & TS_COMPAT)
>> >>
>> >> This came up in the (no out-of-tree) lustre filesystem where some code
>> >> needs to assume 32-bit mode in X86_32 syscalls, and 64-bit mode in kernel
>> >> threads.
>> >>
>> >
>> > I could get on board with:
>> >
>> > ({WARN_ON_ONCE(current->flags & PF_KTHREAD); true})
>> >
>> > The point of these accessors is to be used *in a syscall*.
>> >
>> > What on Earth is Lustre doing that makes it have this problem?
>>
>> Lustre uses it in the ->getattr method to make sure ->ino, ->dev and
>> ->rdev are appropriately sized.  This isn't very different from the
>> usage in ext4 to ensure the seek offset for directories is suitable.
>>
>> These interfaces can be used both from systemcalls and from kernel
>> threads, such as via nfsd.
>>
>> I don't *know* if nfsd is the particular kthread that causes problems
>> for lustre.  All I know is that ->getattr returns 32bit squashed inode
>> numbers in kthread context where 64 bit numbers would be expected.
>>
>
> Well, that looks like Lustre is copying an ext4 bug.

I doubt it was copied - more likely independent evolution.
But on reflection, I see that it is probably reasonable that it
shouldn't be used this way - or at all in this context.
I'll try to understand what the issues really are and see if I can
find a solution that doesn't depend on this interface.
Thanks for your help.

NeilBrown


>
> Hi ext4 people-
>
> ext4's is_32bit_api() function is bogus.  You can't use
> in_compat_syscall() unless you know you're in a syscall
>
> The buggy code was introduced in:
>
> commit d1f5273e9adb40724a85272f248f210dc4ce919a
> Author: Fan Yong 
> Date:   Sun Mar 18 22:44:40 2012 -0400
>
> ext4: return 32/64-bit dir name hash according to usage type
>
> I don't know what the right solution is.  Al, is it legit at all for
> fops->llseek to care about the caller's bitness?  If what ext4 is
> doing is legit, then ISTM the VFS needs to gain a new API to tell
> ->llseek what to do.  But I'm wondering why FMODE_64BITHASH by itself
> isn't sufficient,
>
> I'm quite tempted to add a warning to the x86 arch code to try to
> catch this type of bug.  Fortunately, a bit of grepping suggests that
> ext4 is the only filesystem with this problem.
>
> --Andy


signature.asc
Description: PGP signature


Re: in_compat_syscall() returns from kernel thread for X86_32.

2018-10-23 Thread NeilBrown
On Thu, Oct 18 2018, Andy Lutomirski wrote:

> On Wed, Oct 17, 2018 at 9:36 PM NeilBrown  wrote:
>>
>> On Wed, Oct 17 2018, Andy Lutomirski wrote:
>>
>> > On Wed, Oct 17, 2018 at 6:48 PM NeilBrown  wrote:
>> >>
>> >>
>> >> Was: Re: [tip:x86/asm] x86/entry: Rename is_{ia32,x32}_task() to 
>> >> in_{ia32,x32}_syscall()
>> >> On Tue, Apr 19 2016, tip-bot for Dmitry Safonov wrote:
>> >>
>> >> > Commit-ID:  abfb9498ee1327f534df92a7ecaea81a85913bae
>> >> > Gitweb: 
>> >> > http://git.kernel.org/tip/abfb9498ee1327f534df92a7ecaea81a85913bae
>> >> > Author: Dmitry Safonov 
>> >> > AuthorDate: Mon, 18 Apr 2016 16:43:43 +0300
>> >> > Committer:  Ingo Molnar 
>> >> > CommitDate: Tue, 19 Apr 2016 10:44:52 +0200
>> >> >
>> >> > x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall()
>> >> >
>> >> ...
>> >> > @@ -318,7 +318,7 @@ static inline bool is_x32_task(void)
>> >> >
>> >> >  static inline bool in_compat_syscall(void)
>> >> >  {
>> >> > - return is_ia32_task() || is_x32_task();
>> >> > + return in_ia32_syscall() || in_x32_syscall();
>> >> >  }
>> >>
>> >> Hi,
>> >>  I'm reply to this patch largely to make sure I get the right people
>> >>  .
>> >>
>> >> This test is always true when CONFIG_X86_32 is set, as that forces
>> >> in_ia32_syscall() to true.
>> >> However we might not be in a syscall at all - we might be running a
>> >> kernel thread which is always in 64 mode.
>> >> Every other implementation of in_compat_syscall() that I found is
>> >> dependant on a thread flag or syscall register flag, and so returns
>> >> "false" in a kernel thread.
>> >>
>> >> Might something like this be appropriate?
>> >>
>> >> diff --git a/arch/x86/include/asm/thread_info.h 
>> >> b/arch/x86/include/asm/thread_info.h
>> >> index 2ff2a30a264f..c265b40a78f2 100644
>> >> --- a/arch/x86/include/asm/thread_info.h
>> >> +++ b/arch/x86/include/asm/thread_info.h
>> >> @@ -219,7 +219,7 @@ static inline int arch_within_stack_frames(const void 
>> >> * const stack,
>> >>  #ifndef __ASSEMBLY__
>> >>
>> >>  #ifdef CONFIG_X86_32
>> >> -#define in_ia32_syscall() true
>> >> +#define in_ia32_syscall() (!(current->flags & PF_KTHREAD))
>> >>  #else
>> >>  #define in_ia32_syscall() (IS_ENABLED(CONFIG_IA32_EMULATION) && \
>> >>current_thread_info()->status & TS_COMPAT)
>> >>
>> >> This came up in the (no out-of-tree) lustre filesystem where some code
>> >> needs to assume 32-bit mode in X86_32 syscalls, and 64-bit mode in kernel
>> >> threads.
>> >>
>> >
>> > I could get on board with:
>> >
>> > ({WARN_ON_ONCE(current->flags & PF_KTHREAD); true})
>> >
>> > The point of these accessors is to be used *in a syscall*.
>> >
>> > What on Earth is Lustre doing that makes it have this problem?
>>
>> Lustre uses it in the ->getattr method to make sure ->ino, ->dev and
>> ->rdev are appropriately sized.  This isn't very different from the
>> usage in ext4 to ensure the seek offset for directories is suitable.
>>
>> These interfaces can be used both from systemcalls and from kernel
>> threads, such as via nfsd.
>>
>> I don't *know* if nfsd is the particular kthread that causes problems
>> for lustre.  All I know is that ->getattr returns 32bit squashed inode
>> numbers in kthread context where 64 bit numbers would be expected.
>>
>
> Well, that looks like Lustre is copying an ext4 bug.

I doubt it was copied - more likely independent evolution.
But on reflection, I see that it is probably reasonable that it
shouldn't be used this way - or at all in this context.
I'll try to understand what the issues really are and see if I can
find a solution that doesn't depend on this interface.
Thanks for your help.

NeilBrown


>
> Hi ext4 people-
>
> ext4's is_32bit_api() function is bogus.  You can't use
> in_compat_syscall() unless you know you're in a syscall
>
> The buggy code was introduced in:
>
> commit d1f5273e9adb40724a85272f248f210dc4ce919a
> Author: Fan Yong 
> Date:   Sun Mar 18 22:44:40 2012 -0400
>
> ext4: return 32/64-bit dir name hash according to usage type
>
> I don't know what the right solution is.  Al, is it legit at all for
> fops->llseek to care about the caller's bitness?  If what ext4 is
> doing is legit, then ISTM the VFS needs to gain a new API to tell
> ->llseek what to do.  But I'm wondering why FMODE_64BITHASH by itself
> isn't sufficient,
>
> I'm quite tempted to add a warning to the x86 arch code to try to
> catch this type of bug.  Fortunately, a bit of grepping suggests that
> ext4 is the only filesystem with this problem.
>
> --Andy


signature.asc
Description: PGP signature


Re: in_compat_syscall() returns from kernel thread for X86_32.

2018-10-20 Thread Andy Lutomirski



> On Oct 19, 2018, at 11:02 PM, Andreas Dilger  wrote:
> 
>> On Oct 18, 2018, at 11:26 AM, Andy Lutomirski  wrote:
>> 
>>> On Wed, Oct 17, 2018 at 9:36 PM NeilBrown  wrote:
>>> 
 On Wed, Oct 17 2018, Andy Lutomirski wrote:
 
> On Wed, Oct 17, 2018 at 6:48 PM NeilBrown  wrote:
> 
> 
> Was: Re: [tip:x86/asm] x86/entry: Rename is_{ia32,x32}_task() to 
> in_{ia32,x32}_syscall()
>> On Tue, Apr 19 2016, tip-bot for Dmitry Safonov wrote:
>> 
>> Commit-ID:  abfb9498ee1327f534df92a7ecaea81a85913bae
>> Gitweb: 
>> http://git.kernel.org/tip/abfb9498ee1327f534df92a7ecaea81a85913bae
>> Author: Dmitry Safonov 
>> AuthorDate: Mon, 18 Apr 2016 16:43:43 +0300
>> Committer:  Ingo Molnar 
>> CommitDate: Tue, 19 Apr 2016 10:44:52 +0200
>> 
>> x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall()
>> 
> ...
>> @@ -318,7 +318,7 @@ static inline bool is_x32_task(void)
>> 
>> static inline bool in_compat_syscall(void)
>> {
>> - return is_ia32_task() || is_x32_task();
>> + return in_ia32_syscall() || in_x32_syscall();
>> }
> 
> Hi,
> I'm reply to this patch largely to make sure I get the right people
> .
> 
> This test is always true when CONFIG_X86_32 is set, as that forces
> in_ia32_syscall() to true.
> However we might not be in a syscall at all - we might be running a
> kernel thread which is always in 64 mode.
> Every other implementation of in_compat_syscall() that I found is
> dependant on a thread flag or syscall register flag, and so returns
> "false" in a kernel thread.
> 
> Might something like this be appropriate?
> 
> diff --git a/arch/x86/include/asm/thread_info.h 
> b/arch/x86/include/asm/thread_info.h
> index 2ff2a30a264f..c265b40a78f2 100644
> --- a/arch/x86/include/asm/thread_info.h
> +++ b/arch/x86/include/asm/thread_info.h
> @@ -219,7 +219,7 @@ static inline int arch_within_stack_frames(const void 
> * const stack,
> #ifndef __ASSEMBLY__
> 
> #ifdef CONFIG_X86_32
> -#define in_ia32_syscall() true
> +#define in_ia32_syscall() (!(current->flags & PF_KTHREAD))
> #else
> #define in_ia32_syscall() (IS_ENABLED(CONFIG_IA32_EMULATION) && \
>  current_thread_info()->status & TS_COMPAT)
> 
> This came up in the (no out-of-tree) lustre filesystem where some code
> needs to assume 32-bit mode in X86_32 syscalls, and 64-bit mode in kernel
> threads.
> 
 
 I could get on board with:
 
 ({WARN_ON_ONCE(current->flags & PF_KTHREAD); true})
 
 The point of these accessors is to be used *in a syscall*.
 
 What on Earth is Lustre doing that makes it have this problem?
>>> 
>>> Lustre uses it in the ->getattr method to make sure ->ino, ->dev and
>>> ->rdev are appropriately sized.  This isn't very different from the
>>> usage in ext4 to ensure the seek offset for directories is suitable.
>>> 
>>> These interfaces can be used both from systemcalls and from kernel
>>> threads, such as via nfsd.
>>> 
>>> I don't *know* if nfsd is the particular kthread that causes problems
>>> for lustre.  All I know is that ->getattr returns 32bit squashed inode
>>> numbers in kthread context where 64 bit numbers would be expected.
>>> 
>> 
>> Well, that looks like Lustre is copying an ext4 bug.
>> 
>> Hi ext4 people-
>> 
>> ext4's is_32bit_api() function is bogus.  You can't use
>> in_compat_syscall() unless you know you're in a syscall
>> 
>> The buggy code was introduced in:
>> 
>> commit d1f5273e9adb40724a85272f248f210dc4ce919a
>> Author: Fan Yong 
>> Date:   Sun Mar 18 22:44:40 2012 -0400
>> 
>>   ext4: return 32/64-bit dir name hash according to usage type
>> 
>> I don't know what the right solution is.  Al, is it legit at all for
>> fops->llseek to care about the caller's bitness?  If what ext4 is
>> doing is legit, then ISTM the VFS needs to gain a new API to tell
>> ->llseek what to do.  But I'm wondering why FMODE_64BITHASH by itself
>> isn't sufficient,
>> 
>> I'm quite tempted to add a warning to the x86 arch code to try to
>> catch this type of bug.  Fortunately, a bit of grepping suggests that
>> ext4 is the only filesystem with this problem.
> 
> We need to know whether the readdir cookie returned to userspace
> should be a 32-bit cookie or a 64-bit cookie.  Trying to return
> a 64-bit value will result in -EOVERFLOW for a 32-bit application,
> but is preferable (if possible) because it reduces the chance of
> hash collisions causing readdir to have problems.
> 
> 

Let’s rope Al in. Sorry, I thought he was already on cc.

The concept seems reasonable, but the implementation is problematic. For 
example, the behavior of calling vfs_llseek() is basically undefined.

Is some VFS change needed to fix this?  Maybe a .compat_llseek or some other 
explicit indication of whether a 64-bit hash is okay?

Re: in_compat_syscall() returns from kernel thread for X86_32.

2018-10-20 Thread Andy Lutomirski



> On Oct 19, 2018, at 11:02 PM, Andreas Dilger  wrote:
> 
>> On Oct 18, 2018, at 11:26 AM, Andy Lutomirski  wrote:
>> 
>>> On Wed, Oct 17, 2018 at 9:36 PM NeilBrown  wrote:
>>> 
 On Wed, Oct 17 2018, Andy Lutomirski wrote:
 
> On Wed, Oct 17, 2018 at 6:48 PM NeilBrown  wrote:
> 
> 
> Was: Re: [tip:x86/asm] x86/entry: Rename is_{ia32,x32}_task() to 
> in_{ia32,x32}_syscall()
>> On Tue, Apr 19 2016, tip-bot for Dmitry Safonov wrote:
>> 
>> Commit-ID:  abfb9498ee1327f534df92a7ecaea81a85913bae
>> Gitweb: 
>> http://git.kernel.org/tip/abfb9498ee1327f534df92a7ecaea81a85913bae
>> Author: Dmitry Safonov 
>> AuthorDate: Mon, 18 Apr 2016 16:43:43 +0300
>> Committer:  Ingo Molnar 
>> CommitDate: Tue, 19 Apr 2016 10:44:52 +0200
>> 
>> x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall()
>> 
> ...
>> @@ -318,7 +318,7 @@ static inline bool is_x32_task(void)
>> 
>> static inline bool in_compat_syscall(void)
>> {
>> - return is_ia32_task() || is_x32_task();
>> + return in_ia32_syscall() || in_x32_syscall();
>> }
> 
> Hi,
> I'm reply to this patch largely to make sure I get the right people
> .
> 
> This test is always true when CONFIG_X86_32 is set, as that forces
> in_ia32_syscall() to true.
> However we might not be in a syscall at all - we might be running a
> kernel thread which is always in 64 mode.
> Every other implementation of in_compat_syscall() that I found is
> dependant on a thread flag or syscall register flag, and so returns
> "false" in a kernel thread.
> 
> Might something like this be appropriate?
> 
> diff --git a/arch/x86/include/asm/thread_info.h 
> b/arch/x86/include/asm/thread_info.h
> index 2ff2a30a264f..c265b40a78f2 100644
> --- a/arch/x86/include/asm/thread_info.h
> +++ b/arch/x86/include/asm/thread_info.h
> @@ -219,7 +219,7 @@ static inline int arch_within_stack_frames(const void 
> * const stack,
> #ifndef __ASSEMBLY__
> 
> #ifdef CONFIG_X86_32
> -#define in_ia32_syscall() true
> +#define in_ia32_syscall() (!(current->flags & PF_KTHREAD))
> #else
> #define in_ia32_syscall() (IS_ENABLED(CONFIG_IA32_EMULATION) && \
>  current_thread_info()->status & TS_COMPAT)
> 
> This came up in the (no out-of-tree) lustre filesystem where some code
> needs to assume 32-bit mode in X86_32 syscalls, and 64-bit mode in kernel
> threads.
> 
 
 I could get on board with:
 
 ({WARN_ON_ONCE(current->flags & PF_KTHREAD); true})
 
 The point of these accessors is to be used *in a syscall*.
 
 What on Earth is Lustre doing that makes it have this problem?
>>> 
>>> Lustre uses it in the ->getattr method to make sure ->ino, ->dev and
>>> ->rdev are appropriately sized.  This isn't very different from the
>>> usage in ext4 to ensure the seek offset for directories is suitable.
>>> 
>>> These interfaces can be used both from systemcalls and from kernel
>>> threads, such as via nfsd.
>>> 
>>> I don't *know* if nfsd is the particular kthread that causes problems
>>> for lustre.  All I know is that ->getattr returns 32bit squashed inode
>>> numbers in kthread context where 64 bit numbers would be expected.
>>> 
>> 
>> Well, that looks like Lustre is copying an ext4 bug.
>> 
>> Hi ext4 people-
>> 
>> ext4's is_32bit_api() function is bogus.  You can't use
>> in_compat_syscall() unless you know you're in a syscall
>> 
>> The buggy code was introduced in:
>> 
>> commit d1f5273e9adb40724a85272f248f210dc4ce919a
>> Author: Fan Yong 
>> Date:   Sun Mar 18 22:44:40 2012 -0400
>> 
>>   ext4: return 32/64-bit dir name hash according to usage type
>> 
>> I don't know what the right solution is.  Al, is it legit at all for
>> fops->llseek to care about the caller's bitness?  If what ext4 is
>> doing is legit, then ISTM the VFS needs to gain a new API to tell
>> ->llseek what to do.  But I'm wondering why FMODE_64BITHASH by itself
>> isn't sufficient,
>> 
>> I'm quite tempted to add a warning to the x86 arch code to try to
>> catch this type of bug.  Fortunately, a bit of grepping suggests that
>> ext4 is the only filesystem with this problem.
> 
> We need to know whether the readdir cookie returned to userspace
> should be a 32-bit cookie or a 64-bit cookie.  Trying to return
> a 64-bit value will result in -EOVERFLOW for a 32-bit application,
> but is preferable (if possible) because it reduces the chance of
> hash collisions causing readdir to have problems.
> 
> 

Let’s rope Al in. Sorry, I thought he was already on cc.

The concept seems reasonable, but the implementation is problematic. For 
example, the behavior of calling vfs_llseek() is basically undefined.

Is some VFS change needed to fix this?  Maybe a .compat_llseek or some other 
explicit indication of whether a 64-bit hash is okay?

Re: in_compat_syscall() returns from kernel thread for X86_32.

2018-10-20 Thread Andreas Dilger
On Oct 18, 2018, at 11:26 AM, Andy Lutomirski  wrote:
> 
> On Wed, Oct 17, 2018 at 9:36 PM NeilBrown  wrote:
>> 
>> On Wed, Oct 17 2018, Andy Lutomirski wrote:
>> 
>>> On Wed, Oct 17, 2018 at 6:48 PM NeilBrown  wrote:
 
 
 Was: Re: [tip:x86/asm] x86/entry: Rename is_{ia32,x32}_task() to 
 in_{ia32,x32}_syscall()
 On Tue, Apr 19 2016, tip-bot for Dmitry Safonov wrote:
 
> Commit-ID:  abfb9498ee1327f534df92a7ecaea81a85913bae
> Gitweb: 
> http://git.kernel.org/tip/abfb9498ee1327f534df92a7ecaea81a85913bae
> Author: Dmitry Safonov 
> AuthorDate: Mon, 18 Apr 2016 16:43:43 +0300
> Committer:  Ingo Molnar 
> CommitDate: Tue, 19 Apr 2016 10:44:52 +0200
> 
> x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall()
> 
 ...
> @@ -318,7 +318,7 @@ static inline bool is_x32_task(void)
> 
> static inline bool in_compat_syscall(void)
> {
> - return is_ia32_task() || is_x32_task();
> + return in_ia32_syscall() || in_x32_syscall();
> }
 
 Hi,
 I'm reply to this patch largely to make sure I get the right people
 .
 
 This test is always true when CONFIG_X86_32 is set, as that forces
 in_ia32_syscall() to true.
 However we might not be in a syscall at all - we might be running a
 kernel thread which is always in 64 mode.
 Every other implementation of in_compat_syscall() that I found is
 dependant on a thread flag or syscall register flag, and so returns
 "false" in a kernel thread.
 
 Might something like this be appropriate?
 
 diff --git a/arch/x86/include/asm/thread_info.h 
 b/arch/x86/include/asm/thread_info.h
 index 2ff2a30a264f..c265b40a78f2 100644
 --- a/arch/x86/include/asm/thread_info.h
 +++ b/arch/x86/include/asm/thread_info.h
 @@ -219,7 +219,7 @@ static inline int arch_within_stack_frames(const void 
 * const stack,
 #ifndef __ASSEMBLY__
 
 #ifdef CONFIG_X86_32
 -#define in_ia32_syscall() true
 +#define in_ia32_syscall() (!(current->flags & PF_KTHREAD))
 #else
 #define in_ia32_syscall() (IS_ENABLED(CONFIG_IA32_EMULATION) && \
   current_thread_info()->status & TS_COMPAT)
 
 This came up in the (no out-of-tree) lustre filesystem where some code
 needs to assume 32-bit mode in X86_32 syscalls, and 64-bit mode in kernel
 threads.
 
>>> 
>>> I could get on board with:
>>> 
>>> ({WARN_ON_ONCE(current->flags & PF_KTHREAD); true})
>>> 
>>> The point of these accessors is to be used *in a syscall*.
>>> 
>>> What on Earth is Lustre doing that makes it have this problem?
>> 
>> Lustre uses it in the ->getattr method to make sure ->ino, ->dev and
>> ->rdev are appropriately sized.  This isn't very different from the
>> usage in ext4 to ensure the seek offset for directories is suitable.
>> 
>> These interfaces can be used both from systemcalls and from kernel
>> threads, such as via nfsd.
>> 
>> I don't *know* if nfsd is the particular kthread that causes problems
>> for lustre.  All I know is that ->getattr returns 32bit squashed inode
>> numbers in kthread context where 64 bit numbers would be expected.
>> 
> 
> Well, that looks like Lustre is copying an ext4 bug.
> 
> Hi ext4 people-
> 
> ext4's is_32bit_api() function is bogus.  You can't use
> in_compat_syscall() unless you know you're in a syscall
> 
> The buggy code was introduced in:
> 
> commit d1f5273e9adb40724a85272f248f210dc4ce919a
> Author: Fan Yong 
> Date:   Sun Mar 18 22:44:40 2012 -0400
> 
>ext4: return 32/64-bit dir name hash according to usage type
> 
> I don't know what the right solution is.  Al, is it legit at all for
> fops->llseek to care about the caller's bitness?  If what ext4 is
> doing is legit, then ISTM the VFS needs to gain a new API to tell
> ->llseek what to do.  But I'm wondering why FMODE_64BITHASH by itself
> isn't sufficient,
> 
> I'm quite tempted to add a warning to the x86 arch code to try to
> catch this type of bug.  Fortunately, a bit of grepping suggests that
> ext4 is the only filesystem with this problem.

We need to know whether the readdir cookie returned to userspace
should be a 32-bit cookie or a 64-bit cookie.  Trying to return
a 64-bit value will result in -EOVERFLOW for a 32-bit application,
but is preferable (if possible) because it reduces the chance of
hash collisions causing readdir to have problems.

Cheers, Andreas







signature.asc
Description: Message signed with OpenPGP


Re: in_compat_syscall() returns from kernel thread for X86_32.

2018-10-20 Thread Andreas Dilger
On Oct 18, 2018, at 11:26 AM, Andy Lutomirski  wrote:
> 
> On Wed, Oct 17, 2018 at 9:36 PM NeilBrown  wrote:
>> 
>> On Wed, Oct 17 2018, Andy Lutomirski wrote:
>> 
>>> On Wed, Oct 17, 2018 at 6:48 PM NeilBrown  wrote:
 
 
 Was: Re: [tip:x86/asm] x86/entry: Rename is_{ia32,x32}_task() to 
 in_{ia32,x32}_syscall()
 On Tue, Apr 19 2016, tip-bot for Dmitry Safonov wrote:
 
> Commit-ID:  abfb9498ee1327f534df92a7ecaea81a85913bae
> Gitweb: 
> http://git.kernel.org/tip/abfb9498ee1327f534df92a7ecaea81a85913bae
> Author: Dmitry Safonov 
> AuthorDate: Mon, 18 Apr 2016 16:43:43 +0300
> Committer:  Ingo Molnar 
> CommitDate: Tue, 19 Apr 2016 10:44:52 +0200
> 
> x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall()
> 
 ...
> @@ -318,7 +318,7 @@ static inline bool is_x32_task(void)
> 
> static inline bool in_compat_syscall(void)
> {
> - return is_ia32_task() || is_x32_task();
> + return in_ia32_syscall() || in_x32_syscall();
> }
 
 Hi,
 I'm reply to this patch largely to make sure I get the right people
 .
 
 This test is always true when CONFIG_X86_32 is set, as that forces
 in_ia32_syscall() to true.
 However we might not be in a syscall at all - we might be running a
 kernel thread which is always in 64 mode.
 Every other implementation of in_compat_syscall() that I found is
 dependant on a thread flag or syscall register flag, and so returns
 "false" in a kernel thread.
 
 Might something like this be appropriate?
 
 diff --git a/arch/x86/include/asm/thread_info.h 
 b/arch/x86/include/asm/thread_info.h
 index 2ff2a30a264f..c265b40a78f2 100644
 --- a/arch/x86/include/asm/thread_info.h
 +++ b/arch/x86/include/asm/thread_info.h
 @@ -219,7 +219,7 @@ static inline int arch_within_stack_frames(const void 
 * const stack,
 #ifndef __ASSEMBLY__
 
 #ifdef CONFIG_X86_32
 -#define in_ia32_syscall() true
 +#define in_ia32_syscall() (!(current->flags & PF_KTHREAD))
 #else
 #define in_ia32_syscall() (IS_ENABLED(CONFIG_IA32_EMULATION) && \
   current_thread_info()->status & TS_COMPAT)
 
 This came up in the (no out-of-tree) lustre filesystem where some code
 needs to assume 32-bit mode in X86_32 syscalls, and 64-bit mode in kernel
 threads.
 
>>> 
>>> I could get on board with:
>>> 
>>> ({WARN_ON_ONCE(current->flags & PF_KTHREAD); true})
>>> 
>>> The point of these accessors is to be used *in a syscall*.
>>> 
>>> What on Earth is Lustre doing that makes it have this problem?
>> 
>> Lustre uses it in the ->getattr method to make sure ->ino, ->dev and
>> ->rdev are appropriately sized.  This isn't very different from the
>> usage in ext4 to ensure the seek offset for directories is suitable.
>> 
>> These interfaces can be used both from systemcalls and from kernel
>> threads, such as via nfsd.
>> 
>> I don't *know* if nfsd is the particular kthread that causes problems
>> for lustre.  All I know is that ->getattr returns 32bit squashed inode
>> numbers in kthread context where 64 bit numbers would be expected.
>> 
> 
> Well, that looks like Lustre is copying an ext4 bug.
> 
> Hi ext4 people-
> 
> ext4's is_32bit_api() function is bogus.  You can't use
> in_compat_syscall() unless you know you're in a syscall
> 
> The buggy code was introduced in:
> 
> commit d1f5273e9adb40724a85272f248f210dc4ce919a
> Author: Fan Yong 
> Date:   Sun Mar 18 22:44:40 2012 -0400
> 
>ext4: return 32/64-bit dir name hash according to usage type
> 
> I don't know what the right solution is.  Al, is it legit at all for
> fops->llseek to care about the caller's bitness?  If what ext4 is
> doing is legit, then ISTM the VFS needs to gain a new API to tell
> ->llseek what to do.  But I'm wondering why FMODE_64BITHASH by itself
> isn't sufficient,
> 
> I'm quite tempted to add a warning to the x86 arch code to try to
> catch this type of bug.  Fortunately, a bit of grepping suggests that
> ext4 is the only filesystem with this problem.

We need to know whether the readdir cookie returned to userspace
should be a 32-bit cookie or a 64-bit cookie.  Trying to return
a 64-bit value will result in -EOVERFLOW for a 32-bit application,
but is preferable (if possible) because it reduces the chance of
hash collisions causing readdir to have problems.

Cheers, Andreas







signature.asc
Description: Message signed with OpenPGP


Re: in_compat_syscall() returns from kernel thread for X86_32.

2018-10-18 Thread Andy Lutomirski
On Wed, Oct 17, 2018 at 9:36 PM NeilBrown  wrote:
>
> On Wed, Oct 17 2018, Andy Lutomirski wrote:
>
> > On Wed, Oct 17, 2018 at 6:48 PM NeilBrown  wrote:
> >>
> >>
> >> Was: Re: [tip:x86/asm] x86/entry: Rename is_{ia32,x32}_task() to 
> >> in_{ia32,x32}_syscall()
> >> On Tue, Apr 19 2016, tip-bot for Dmitry Safonov wrote:
> >>
> >> > Commit-ID:  abfb9498ee1327f534df92a7ecaea81a85913bae
> >> > Gitweb: 
> >> > http://git.kernel.org/tip/abfb9498ee1327f534df92a7ecaea81a85913bae
> >> > Author: Dmitry Safonov 
> >> > AuthorDate: Mon, 18 Apr 2016 16:43:43 +0300
> >> > Committer:  Ingo Molnar 
> >> > CommitDate: Tue, 19 Apr 2016 10:44:52 +0200
> >> >
> >> > x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall()
> >> >
> >> ...
> >> > @@ -318,7 +318,7 @@ static inline bool is_x32_task(void)
> >> >
> >> >  static inline bool in_compat_syscall(void)
> >> >  {
> >> > - return is_ia32_task() || is_x32_task();
> >> > + return in_ia32_syscall() || in_x32_syscall();
> >> >  }
> >>
> >> Hi,
> >>  I'm reply to this patch largely to make sure I get the right people
> >>  .
> >>
> >> This test is always true when CONFIG_X86_32 is set, as that forces
> >> in_ia32_syscall() to true.
> >> However we might not be in a syscall at all - we might be running a
> >> kernel thread which is always in 64 mode.
> >> Every other implementation of in_compat_syscall() that I found is
> >> dependant on a thread flag or syscall register flag, and so returns
> >> "false" in a kernel thread.
> >>
> >> Might something like this be appropriate?
> >>
> >> diff --git a/arch/x86/include/asm/thread_info.h 
> >> b/arch/x86/include/asm/thread_info.h
> >> index 2ff2a30a264f..c265b40a78f2 100644
> >> --- a/arch/x86/include/asm/thread_info.h
> >> +++ b/arch/x86/include/asm/thread_info.h
> >> @@ -219,7 +219,7 @@ static inline int arch_within_stack_frames(const void 
> >> * const stack,
> >>  #ifndef __ASSEMBLY__
> >>
> >>  #ifdef CONFIG_X86_32
> >> -#define in_ia32_syscall() true
> >> +#define in_ia32_syscall() (!(current->flags & PF_KTHREAD))
> >>  #else
> >>  #define in_ia32_syscall() (IS_ENABLED(CONFIG_IA32_EMULATION) && \
> >>current_thread_info()->status & TS_COMPAT)
> >>
> >> This came up in the (no out-of-tree) lustre filesystem where some code
> >> needs to assume 32-bit mode in X86_32 syscalls, and 64-bit mode in kernel
> >> threads.
> >>
> >
> > I could get on board with:
> >
> > ({WARN_ON_ONCE(current->flags & PF_KTHREAD); true})
> >
> > The point of these accessors is to be used *in a syscall*.
> >
> > What on Earth is Lustre doing that makes it have this problem?
>
> Lustre uses it in the ->getattr method to make sure ->ino, ->dev and
> ->rdev are appropriately sized.  This isn't very different from the
> usage in ext4 to ensure the seek offset for directories is suitable.
>
> These interfaces can be used both from systemcalls and from kernel
> threads, such as via nfsd.
>
> I don't *know* if nfsd is the particular kthread that causes problems
> for lustre.  All I know is that ->getattr returns 32bit squashed inode
> numbers in kthread context where 64 bit numbers would be expected.
>

Well, that looks like Lustre is copying an ext4 bug.

Hi ext4 people-

ext4's is_32bit_api() function is bogus.  You can't use
in_compat_syscall() unless you know you're in a syscall

The buggy code was introduced in:

commit d1f5273e9adb40724a85272f248f210dc4ce919a
Author: Fan Yong 
Date:   Sun Mar 18 22:44:40 2012 -0400

ext4: return 32/64-bit dir name hash according to usage type

I don't know what the right solution is.  Al, is it legit at all for
fops->llseek to care about the caller's bitness?  If what ext4 is
doing is legit, then ISTM the VFS needs to gain a new API to tell
->llseek what to do.  But I'm wondering why FMODE_64BITHASH by itself
isn't sufficient,

I'm quite tempted to add a warning to the x86 arch code to try to
catch this type of bug.  Fortunately, a bit of grepping suggests that
ext4 is the only filesystem with this problem.

--Andy


Re: in_compat_syscall() returns from kernel thread for X86_32.

2018-10-18 Thread Andy Lutomirski
On Wed, Oct 17, 2018 at 9:36 PM NeilBrown  wrote:
>
> On Wed, Oct 17 2018, Andy Lutomirski wrote:
>
> > On Wed, Oct 17, 2018 at 6:48 PM NeilBrown  wrote:
> >>
> >>
> >> Was: Re: [tip:x86/asm] x86/entry: Rename is_{ia32,x32}_task() to 
> >> in_{ia32,x32}_syscall()
> >> On Tue, Apr 19 2016, tip-bot for Dmitry Safonov wrote:
> >>
> >> > Commit-ID:  abfb9498ee1327f534df92a7ecaea81a85913bae
> >> > Gitweb: 
> >> > http://git.kernel.org/tip/abfb9498ee1327f534df92a7ecaea81a85913bae
> >> > Author: Dmitry Safonov 
> >> > AuthorDate: Mon, 18 Apr 2016 16:43:43 +0300
> >> > Committer:  Ingo Molnar 
> >> > CommitDate: Tue, 19 Apr 2016 10:44:52 +0200
> >> >
> >> > x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall()
> >> >
> >> ...
> >> > @@ -318,7 +318,7 @@ static inline bool is_x32_task(void)
> >> >
> >> >  static inline bool in_compat_syscall(void)
> >> >  {
> >> > - return is_ia32_task() || is_x32_task();
> >> > + return in_ia32_syscall() || in_x32_syscall();
> >> >  }
> >>
> >> Hi,
> >>  I'm reply to this patch largely to make sure I get the right people
> >>  .
> >>
> >> This test is always true when CONFIG_X86_32 is set, as that forces
> >> in_ia32_syscall() to true.
> >> However we might not be in a syscall at all - we might be running a
> >> kernel thread which is always in 64 mode.
> >> Every other implementation of in_compat_syscall() that I found is
> >> dependant on a thread flag or syscall register flag, and so returns
> >> "false" in a kernel thread.
> >>
> >> Might something like this be appropriate?
> >>
> >> diff --git a/arch/x86/include/asm/thread_info.h 
> >> b/arch/x86/include/asm/thread_info.h
> >> index 2ff2a30a264f..c265b40a78f2 100644
> >> --- a/arch/x86/include/asm/thread_info.h
> >> +++ b/arch/x86/include/asm/thread_info.h
> >> @@ -219,7 +219,7 @@ static inline int arch_within_stack_frames(const void 
> >> * const stack,
> >>  #ifndef __ASSEMBLY__
> >>
> >>  #ifdef CONFIG_X86_32
> >> -#define in_ia32_syscall() true
> >> +#define in_ia32_syscall() (!(current->flags & PF_KTHREAD))
> >>  #else
> >>  #define in_ia32_syscall() (IS_ENABLED(CONFIG_IA32_EMULATION) && \
> >>current_thread_info()->status & TS_COMPAT)
> >>
> >> This came up in the (no out-of-tree) lustre filesystem where some code
> >> needs to assume 32-bit mode in X86_32 syscalls, and 64-bit mode in kernel
> >> threads.
> >>
> >
> > I could get on board with:
> >
> > ({WARN_ON_ONCE(current->flags & PF_KTHREAD); true})
> >
> > The point of these accessors is to be used *in a syscall*.
> >
> > What on Earth is Lustre doing that makes it have this problem?
>
> Lustre uses it in the ->getattr method to make sure ->ino, ->dev and
> ->rdev are appropriately sized.  This isn't very different from the
> usage in ext4 to ensure the seek offset for directories is suitable.
>
> These interfaces can be used both from systemcalls and from kernel
> threads, such as via nfsd.
>
> I don't *know* if nfsd is the particular kthread that causes problems
> for lustre.  All I know is that ->getattr returns 32bit squashed inode
> numbers in kthread context where 64 bit numbers would be expected.
>

Well, that looks like Lustre is copying an ext4 bug.

Hi ext4 people-

ext4's is_32bit_api() function is bogus.  You can't use
in_compat_syscall() unless you know you're in a syscall

The buggy code was introduced in:

commit d1f5273e9adb40724a85272f248f210dc4ce919a
Author: Fan Yong 
Date:   Sun Mar 18 22:44:40 2012 -0400

ext4: return 32/64-bit dir name hash according to usage type

I don't know what the right solution is.  Al, is it legit at all for
fops->llseek to care about the caller's bitness?  If what ext4 is
doing is legit, then ISTM the VFS needs to gain a new API to tell
->llseek what to do.  But I'm wondering why FMODE_64BITHASH by itself
isn't sufficient,

I'm quite tempted to add a warning to the x86 arch code to try to
catch this type of bug.  Fortunately, a bit of grepping suggests that
ext4 is the only filesystem with this problem.

--Andy


Re: in_compat_syscall() returns from kernel thread for X86_32.

2018-10-17 Thread NeilBrown
On Wed, Oct 17 2018, Andy Lutomirski wrote:

> On Wed, Oct 17, 2018 at 6:48 PM NeilBrown  wrote:
>>
>>
>> Was: Re: [tip:x86/asm] x86/entry: Rename is_{ia32,x32}_task() to 
>> in_{ia32,x32}_syscall()
>> On Tue, Apr 19 2016, tip-bot for Dmitry Safonov wrote:
>>
>> > Commit-ID:  abfb9498ee1327f534df92a7ecaea81a85913bae
>> > Gitweb: 
>> > http://git.kernel.org/tip/abfb9498ee1327f534df92a7ecaea81a85913bae
>> > Author: Dmitry Safonov 
>> > AuthorDate: Mon, 18 Apr 2016 16:43:43 +0300
>> > Committer:  Ingo Molnar 
>> > CommitDate: Tue, 19 Apr 2016 10:44:52 +0200
>> >
>> > x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall()
>> >
>> ...
>> > @@ -318,7 +318,7 @@ static inline bool is_x32_task(void)
>> >
>> >  static inline bool in_compat_syscall(void)
>> >  {
>> > - return is_ia32_task() || is_x32_task();
>> > + return in_ia32_syscall() || in_x32_syscall();
>> >  }
>>
>> Hi,
>>  I'm reply to this patch largely to make sure I get the right people
>>  .
>>
>> This test is always true when CONFIG_X86_32 is set, as that forces
>> in_ia32_syscall() to true.
>> However we might not be in a syscall at all - we might be running a
>> kernel thread which is always in 64 mode.
>> Every other implementation of in_compat_syscall() that I found is
>> dependant on a thread flag or syscall register flag, and so returns
>> "false" in a kernel thread.
>>
>> Might something like this be appropriate?
>>
>> diff --git a/arch/x86/include/asm/thread_info.h 
>> b/arch/x86/include/asm/thread_info.h
>> index 2ff2a30a264f..c265b40a78f2 100644
>> --- a/arch/x86/include/asm/thread_info.h
>> +++ b/arch/x86/include/asm/thread_info.h
>> @@ -219,7 +219,7 @@ static inline int arch_within_stack_frames(const void * 
>> const stack,
>>  #ifndef __ASSEMBLY__
>>
>>  #ifdef CONFIG_X86_32
>> -#define in_ia32_syscall() true
>> +#define in_ia32_syscall() (!(current->flags & PF_KTHREAD))
>>  #else
>>  #define in_ia32_syscall() (IS_ENABLED(CONFIG_IA32_EMULATION) && \
>>current_thread_info()->status & TS_COMPAT)
>>
>> This came up in the (no out-of-tree) lustre filesystem where some code
>> needs to assume 32-bit mode in X86_32 syscalls, and 64-bit mode in kernel
>> threads.
>>
>
> I could get on board with:
>
> ({WARN_ON_ONCE(current->flags & PF_KTHREAD); true})
>
> The point of these accessors is to be used *in a syscall*.
>
> What on Earth is Lustre doing that makes it have this problem?

Lustre uses it in the ->getattr method to make sure ->ino, ->dev and
->rdev are appropriately sized.  This isn't very different from the
usage in ext4 to ensure the seek offset for directories is suitable.

These interfaces can be used both from systemcalls and from kernel
threads, such as via nfsd.

I don't *know* if nfsd is the particular kthread that causes problems
for lustre.  All I know is that ->getattr returns 32bit squashed inode
numbers in kthread context where 64 bit numbers would be expected.

Thanks,
NeilBrown



signature.asc
Description: PGP signature


Re: in_compat_syscall() returns from kernel thread for X86_32.

2018-10-17 Thread NeilBrown
On Wed, Oct 17 2018, Andy Lutomirski wrote:

> On Wed, Oct 17, 2018 at 6:48 PM NeilBrown  wrote:
>>
>>
>> Was: Re: [tip:x86/asm] x86/entry: Rename is_{ia32,x32}_task() to 
>> in_{ia32,x32}_syscall()
>> On Tue, Apr 19 2016, tip-bot for Dmitry Safonov wrote:
>>
>> > Commit-ID:  abfb9498ee1327f534df92a7ecaea81a85913bae
>> > Gitweb: 
>> > http://git.kernel.org/tip/abfb9498ee1327f534df92a7ecaea81a85913bae
>> > Author: Dmitry Safonov 
>> > AuthorDate: Mon, 18 Apr 2016 16:43:43 +0300
>> > Committer:  Ingo Molnar 
>> > CommitDate: Tue, 19 Apr 2016 10:44:52 +0200
>> >
>> > x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall()
>> >
>> ...
>> > @@ -318,7 +318,7 @@ static inline bool is_x32_task(void)
>> >
>> >  static inline bool in_compat_syscall(void)
>> >  {
>> > - return is_ia32_task() || is_x32_task();
>> > + return in_ia32_syscall() || in_x32_syscall();
>> >  }
>>
>> Hi,
>>  I'm reply to this patch largely to make sure I get the right people
>>  .
>>
>> This test is always true when CONFIG_X86_32 is set, as that forces
>> in_ia32_syscall() to true.
>> However we might not be in a syscall at all - we might be running a
>> kernel thread which is always in 64 mode.
>> Every other implementation of in_compat_syscall() that I found is
>> dependant on a thread flag or syscall register flag, and so returns
>> "false" in a kernel thread.
>>
>> Might something like this be appropriate?
>>
>> diff --git a/arch/x86/include/asm/thread_info.h 
>> b/arch/x86/include/asm/thread_info.h
>> index 2ff2a30a264f..c265b40a78f2 100644
>> --- a/arch/x86/include/asm/thread_info.h
>> +++ b/arch/x86/include/asm/thread_info.h
>> @@ -219,7 +219,7 @@ static inline int arch_within_stack_frames(const void * 
>> const stack,
>>  #ifndef __ASSEMBLY__
>>
>>  #ifdef CONFIG_X86_32
>> -#define in_ia32_syscall() true
>> +#define in_ia32_syscall() (!(current->flags & PF_KTHREAD))
>>  #else
>>  #define in_ia32_syscall() (IS_ENABLED(CONFIG_IA32_EMULATION) && \
>>current_thread_info()->status & TS_COMPAT)
>>
>> This came up in the (no out-of-tree) lustre filesystem where some code
>> needs to assume 32-bit mode in X86_32 syscalls, and 64-bit mode in kernel
>> threads.
>>
>
> I could get on board with:
>
> ({WARN_ON_ONCE(current->flags & PF_KTHREAD); true})
>
> The point of these accessors is to be used *in a syscall*.
>
> What on Earth is Lustre doing that makes it have this problem?

Lustre uses it in the ->getattr method to make sure ->ino, ->dev and
->rdev are appropriately sized.  This isn't very different from the
usage in ext4 to ensure the seek offset for directories is suitable.

These interfaces can be used both from systemcalls and from kernel
threads, such as via nfsd.

I don't *know* if nfsd is the particular kthread that causes problems
for lustre.  All I know is that ->getattr returns 32bit squashed inode
numbers in kthread context where 64 bit numbers would be expected.

Thanks,
NeilBrown



signature.asc
Description: PGP signature


Re: in_compat_syscall() returns from kernel thread for X86_32.

2018-10-17 Thread Al Viro
On Wed, Oct 17, 2018 at 07:37:42PM -0700, Andy Lutomirski wrote:

> I could get on board with:
> 
> ({WARN_ON_ONCE(current->flags & PF_KTHREAD); true})
> 
> The point of these accessors is to be used *in a syscall*.
> 
> What on Earth is Lustre doing that makes it have this problem?

Plays with timestamps from a kernel thread, perhaps?
See the old .su joke about adenoidectomy with rectal access
for possible reasons for doing it that way...


Re: in_compat_syscall() returns from kernel thread for X86_32.

2018-10-17 Thread Al Viro
On Wed, Oct 17, 2018 at 07:37:42PM -0700, Andy Lutomirski wrote:

> I could get on board with:
> 
> ({WARN_ON_ONCE(current->flags & PF_KTHREAD); true})
> 
> The point of these accessors is to be used *in a syscall*.
> 
> What on Earth is Lustre doing that makes it have this problem?

Plays with timestamps from a kernel thread, perhaps?
See the old .su joke about adenoidectomy with rectal access
for possible reasons for doing it that way...


Re: in_compat_syscall() returns from kernel thread for X86_32.

2018-10-17 Thread Andy Lutomirski
On Wed, Oct 17, 2018 at 6:48 PM NeilBrown  wrote:
>
>
> Was: Re: [tip:x86/asm] x86/entry: Rename is_{ia32,x32}_task() to 
> in_{ia32,x32}_syscall()
> On Tue, Apr 19 2016, tip-bot for Dmitry Safonov wrote:
>
> > Commit-ID:  abfb9498ee1327f534df92a7ecaea81a85913bae
> > Gitweb: 
> > http://git.kernel.org/tip/abfb9498ee1327f534df92a7ecaea81a85913bae
> > Author: Dmitry Safonov 
> > AuthorDate: Mon, 18 Apr 2016 16:43:43 +0300
> > Committer:  Ingo Molnar 
> > CommitDate: Tue, 19 Apr 2016 10:44:52 +0200
> >
> > x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall()
> >
> ...
> > @@ -318,7 +318,7 @@ static inline bool is_x32_task(void)
> >
> >  static inline bool in_compat_syscall(void)
> >  {
> > - return is_ia32_task() || is_x32_task();
> > + return in_ia32_syscall() || in_x32_syscall();
> >  }
>
> Hi,
>  I'm reply to this patch largely to make sure I get the right people
>  .
>
> This test is always true when CONFIG_X86_32 is set, as that forces
> in_ia32_syscall() to true.
> However we might not be in a syscall at all - we might be running a
> kernel thread which is always in 64 mode.
> Every other implementation of in_compat_syscall() that I found is
> dependant on a thread flag or syscall register flag, and so returns
> "false" in a kernel thread.
>
> Might something like this be appropriate?
>
> diff --git a/arch/x86/include/asm/thread_info.h 
> b/arch/x86/include/asm/thread_info.h
> index 2ff2a30a264f..c265b40a78f2 100644
> --- a/arch/x86/include/asm/thread_info.h
> +++ b/arch/x86/include/asm/thread_info.h
> @@ -219,7 +219,7 @@ static inline int arch_within_stack_frames(const void * 
> const stack,
>  #ifndef __ASSEMBLY__
>
>  #ifdef CONFIG_X86_32
> -#define in_ia32_syscall() true
> +#define in_ia32_syscall() (!(current->flags & PF_KTHREAD))
>  #else
>  #define in_ia32_syscall() (IS_ENABLED(CONFIG_IA32_EMULATION) && \
>current_thread_info()->status & TS_COMPAT)
>
> This came up in the (no out-of-tree) lustre filesystem where some code
> needs to assume 32-bit mode in X86_32 syscalls, and 64-bit mode in kernel
> threads.
>

I could get on board with:

({WARN_ON_ONCE(current->flags & PF_KTHREAD); true})

The point of these accessors is to be used *in a syscall*.

What on Earth is Lustre doing that makes it have this problem?


Re: in_compat_syscall() returns from kernel thread for X86_32.

2018-10-17 Thread Andy Lutomirski
On Wed, Oct 17, 2018 at 6:48 PM NeilBrown  wrote:
>
>
> Was: Re: [tip:x86/asm] x86/entry: Rename is_{ia32,x32}_task() to 
> in_{ia32,x32}_syscall()
> On Tue, Apr 19 2016, tip-bot for Dmitry Safonov wrote:
>
> > Commit-ID:  abfb9498ee1327f534df92a7ecaea81a85913bae
> > Gitweb: 
> > http://git.kernel.org/tip/abfb9498ee1327f534df92a7ecaea81a85913bae
> > Author: Dmitry Safonov 
> > AuthorDate: Mon, 18 Apr 2016 16:43:43 +0300
> > Committer:  Ingo Molnar 
> > CommitDate: Tue, 19 Apr 2016 10:44:52 +0200
> >
> > x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall()
> >
> ...
> > @@ -318,7 +318,7 @@ static inline bool is_x32_task(void)
> >
> >  static inline bool in_compat_syscall(void)
> >  {
> > - return is_ia32_task() || is_x32_task();
> > + return in_ia32_syscall() || in_x32_syscall();
> >  }
>
> Hi,
>  I'm reply to this patch largely to make sure I get the right people
>  .
>
> This test is always true when CONFIG_X86_32 is set, as that forces
> in_ia32_syscall() to true.
> However we might not be in a syscall at all - we might be running a
> kernel thread which is always in 64 mode.
> Every other implementation of in_compat_syscall() that I found is
> dependant on a thread flag or syscall register flag, and so returns
> "false" in a kernel thread.
>
> Might something like this be appropriate?
>
> diff --git a/arch/x86/include/asm/thread_info.h 
> b/arch/x86/include/asm/thread_info.h
> index 2ff2a30a264f..c265b40a78f2 100644
> --- a/arch/x86/include/asm/thread_info.h
> +++ b/arch/x86/include/asm/thread_info.h
> @@ -219,7 +219,7 @@ static inline int arch_within_stack_frames(const void * 
> const stack,
>  #ifndef __ASSEMBLY__
>
>  #ifdef CONFIG_X86_32
> -#define in_ia32_syscall() true
> +#define in_ia32_syscall() (!(current->flags & PF_KTHREAD))
>  #else
>  #define in_ia32_syscall() (IS_ENABLED(CONFIG_IA32_EMULATION) && \
>current_thread_info()->status & TS_COMPAT)
>
> This came up in the (no out-of-tree) lustre filesystem where some code
> needs to assume 32-bit mode in X86_32 syscalls, and 64-bit mode in kernel
> threads.
>

I could get on board with:

({WARN_ON_ONCE(current->flags & PF_KTHREAD); true})

The point of these accessors is to be used *in a syscall*.

What on Earth is Lustre doing that makes it have this problem?