David Howells writes:
> Andy Lutomirski wrote:
>
>> > Also you can't currently directly create a bind mount from userspace as you
>> > can only bind from another path point - which you may not be able to access
>> > (either by permission failure or because it's not in your mount namespace).
>> >
Andy Lutomirski wrote:
> > Whilst I'm at it, do we want the option of doing the equivalent of
> > mountat()? I.e. offering the option to open all the device files used by
> > a superblock with dfd and AT_* flags in combination with the filename?
> >
>
> Isn’t that more or less what I was sugge
On Fri, Jul 13, 2018 at 8:40 AM, David Howells wrote:
> Andy Lutomirski wrote:
>
>> > Whilst I'm at it, do we want the option of doing the equivalent of
>> > mountat()? I.e. offering the option to open all the device files used by
>> > a superblock with dfd and AT_* flags in combination with the
Andy Lutomirski wrote:
> > Whilst I'm at it, do we want the option of doing the equivalent of
> > mountat()? I.e. offering the option to open all the device files used by
> > a superblock with dfd and AT_* flags in combination with the filename?
> >
>
> Isn't that more or less what I was sugge
> On Jul 13, 2018, at 6:27 AM, David Howells wrote:
>
> Whilst I'm at it, do we want the option of doing the equivalent of mountat()?
> I.e. offering the option to open all the device files used by a superblock
> with dfd and AT_* flags in combination with the filename?
>
Isn’t that more or
Whilst I'm at it, do we want the option of doing the equivalent of mountat()?
I.e. offering the option to open all the device files used by a superblock
with dfd and AT_* flags in combination with the filename?
David
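For readers unfamiliar with the "dfd and AT_* flags in combination with the filename" convention mentioned above, the following is a minimal sketch of how the existing *at() calls already resolve a name relative to a directory fd. The helpers open_at_dir() and is_blockdev_at() are hypothetical names invented for this illustration; they are not part of the proposal, and the proposal's own interface for device files was still undecided at this point in the thread.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Open `name` with the lookup starting at the directory `dfd` rather than
 * at the current working directory. O_NOFOLLOW refuses a trailing symlink.
 * Returns an open fd, or -1 on error. (Hypothetical helper.)
 */
static int open_at_dir(int dfd, const char *name)
{
    return openat(dfd, name, O_RDONLY | O_NOFOLLOW | O_CLOEXEC);
}

/*
 * Report whether `name`, looked up relative to `dfd`, is a block device.
 * AT_SYMLINK_NOFOLLOW makes the check apply to a symlink itself rather
 * than to its target. Returns 1, 0, or -1 on error. (Hypothetical helper.)
 */
static int is_blockdev_at(int dfd, const char *name)
{
    struct stat st;

    if (fstatat(dfd, name, &st, AT_SYMLINK_NOFOLLOW) != 0)
        return -1;
    return S_ISBLK(st.st_mode) ? 1 : 0;
}
```

A caller would first obtain dfd with open(dir, O_RDONLY | O_DIRECTORY) and then pass the same dfd to both helpers, so that the check and the open resolve the name from the same starting directory.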
Andy Lutomirski wrote:
> > Also you can't currently directly create a bind mount from userspace as you
> > can only bind from another path point - which you may not be able to access
> > (either by permission failure or because it's not in your mount namespace).
> >
>
> Are you trying to preser
On Thu, Jul 12, 2018 at 11:54:41PM +0100, David Howells wrote:
>
> Would that mean then that doing:
>
> mount /dev/sda3 /a
> mount /dev/sda3 /b
>
> would then fail on the second command because /dev/sda3 is already open
> exclusively?
Good point. One workaround would be to require
> On Jul 12, 2018, at 5:03 PM, David Howells wrote:
>
> Andy Lutomirski wrote:
>
>>>> I tend to think that this *should* fail using the new API. The semantics
>>>> of the second mount request are bizarre at best.
>>>
>>> You still have to support existing behaviour lest you break userspace
Andy Lutomirski wrote:
> >> I tend to think that this *should* fail using the new API. The semantics
> >> of the second mount request are bizarre at best.
> >
> > You still have to support existing behaviour lest you break userspace.
> >
>
> I assume the existing behavior is that a bind mount
> On Jul 12, 2018, at 4:35 PM, David Howells wrote:
>
> Andy Lutomirski wrote:
>
>> I tend to think that this *should* fail using the new API. The semantics of
>> the second mount request are bizarre at best.
>
> You still have to support existing behaviour lest you break userspace.
>
I
Andy Lutomirski wrote:
> I tend to think that this *should* fail using the new API. The semantics of
> the second mount request are bizarre at best.
You still have to support existing behaviour lest you break userspace.
David
On Thu, Jul 12, 2018 at 4:23 PM Jann Horn wrote:
>
> On Thu, Jul 12, 2018 at 3:54 PM David Howells wrote:
> >
> > Theodore Y. Ts'o wrote:
> >
> > > So maybe the answer is that you open /dev/sda1 and /dev/sda2 and then
> > > pass the file descriptors to the fsopen object? We can require that
> >
On Thu, Jul 12, 2018 at 3:54 PM David Howells wrote:
>
> Theodore Y. Ts'o wrote:
>
> > So maybe the answer is that you open /dev/sda1 and /dev/sda2 and then
> > pass the file descriptors to the fsopen object? We can require that
> > the fd's be opened with O_RDWR and O_EXCL, which has the benefi
> On Jul 12, 2018, at 3:54 PM, David Howells wrote:
>
> Theodore Y. Ts'o wrote:
>
>> So maybe the answer is that you open /dev/sda1 and /dev/sda2 and then
>> pass the file descriptors to the fsopen object? We can require that
>> the fd's be opened with O_RDWR and O_EXCL, which has the benef
Theodore Y. Ts'o wrote:
> So maybe the answer is that you open /dev/sda1 and /dev/sda2 and then
> pass the file descriptors to the fsopen object? We can require that
> the fd's be opened with O_RDWR and O_EXCL, which has the benefit where
> if you have multiple block devices, you know *which* bl
On Thu, Jul 12, 2018 at 10:26:37PM +0100, David Howells wrote:
> The problem is that there's more than one actual "open" involved.
>
> fd = fsopen("ext4");   <--- #1
> whatever_interface(fd, "s /dev/sda1");
> whatever_interface(fd, "o journal_path=/dev/sd
On Thu, Jul 12, 2018 at 2:26 PM David Howells wrote:
>
> The problem is that there's more than one actual "open" involved.
No. The problem is "write()".
This is not about open, about fsopen, or about anything at all.
This is about the fact that "write()" by definition can happen in a
different
On Thu, Jul 12, 2018 at 2:00 PM David Howells wrote:
>
>
> for example:
>
> fd = fsopen("ext4", FSOPEN_CLOEXEC);
> fsconfig(fd, fsconfig_blockdev, "dev.data", "/dev/sda1", ...);
> fsconfig(fd, fsconfig_blockdev, "dev.journal", "/dev/sda2", ...);
Ok, that looks good to me.
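The shape of the example above (open a context, attach named settings one call at a time, act on them only later) can be modelled in plain userspace C. This is a toy sketch only: toy_fs_context, toy_fsopen(), toy_fsconfig() and toy_create() are hypothetical names invented here, no kernel interface is called, and the rule that a "dev.data" setting is required is an assumption made so that the final step has something to validate.

```c
#include <string.h>

#define TOY_MAX_SETTINGS 8
#define TOY_MAX_LEN      64

/* Toy stand-in for the context behind the fd; not a kernel structure. */
struct toy_fs_context {
    char fstype[TOY_MAX_LEN];
    char keys[TOY_MAX_SETTINGS][TOY_MAX_LEN];
    char values[TOY_MAX_SETTINGS][TOY_MAX_LEN];
    int  nr_settings;
    int  created;
};

/* Start a context for one filesystem type. Returns 0, or -1 if too long. */
static int toy_fsopen(struct toy_fs_context *ctx, const char *fstype)
{
    if (strlen(fstype) >= TOY_MAX_LEN)
        return -1;
    memset(ctx, 0, sizeof(*ctx));
    strcpy(ctx->fstype, fstype);
    return 0;
}

/* Record one key/value pair. Nothing is opened or mounted here. */
static int toy_fsconfig(struct toy_fs_context *ctx,
                        const char *key, const char *value)
{
    if (ctx->created || ctx->nr_settings == TOY_MAX_SETTINGS)
        return -1;
    if (strlen(key) >= TOY_MAX_LEN || strlen(value) >= TOY_MAX_LEN)
        return -1;
    strcpy(ctx->keys[ctx->nr_settings], key);
    strcpy(ctx->values[ctx->nr_settings], value);
    ctx->nr_settings++;
    return 0;
}

/* The only step that acts on the settings; assumed to need "dev.data". */
static int toy_create(struct toy_fs_context *ctx)
{
    if (ctx->created)
        return -1;
    for (int i = 0; i < ctx->nr_settings; i++) {
        if (strcmp(ctx->keys[i], "dev.data") == 0) {
            ctx->created = 1;
            return 0;
        }
    }
    return -1;
}
```

In this model a caller runs toy_fsopen(&ctx, "ext4"), then toy_fsconfig() once each for "dev.data" and "dev.journal", then toy_create(). Because the settings table has a fixed size, the amount of buffered configuration is bounded, and further toy_fsconfig() calls are refused once the create step has run.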
Linus Torvalds wrote:
> The unix semantics are that credentials are checked at open time.
Sigh.
The problem is that there's more than one actual "open" involved.
fd = fsopen("ext4");   <--- #1
whatever_interface(fd, "s /dev/sda1");
whatever_inte
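The rule quoted above, that credentials are checked at open time, can be observed with ordinary files. The following is a minimal sketch, assuming a writable /tmp; read_survives_chmod() is a hypothetical helper written for this illustration and has nothing to do with the fsopen() patches themselves. It opens a file, revokes every permission bit on the path, and then reads through the descriptor that was opened earlier.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Returns 1 if a descriptor opened before the file's permissions were
 * revoked can still be read afterwards, 0 if it cannot, and -1 if the
 * setup itself failed. (Hypothetical helper.)
 */
static int read_survives_chmod(void)
{
    char path[] = "/tmp/open-time-XXXXXX";
    char buf[8] = {0};
    int result = -1;
    int rfd = -1;

    int wfd = mkstemp(path);           /* creates the file with mode 0600 */
    if (wfd < 0)
        return -1;

    if (write(wfd, "data", 4) == 4)
        rfd = open(path, O_RDONLY);    /* the permission check happens here */

    if (rfd >= 0 && chmod(path, 0) == 0) {
        /* Mode is now 0000, yet the already-open descriptor still works. */
        result = (read(rfd, buf, sizeof(buf) - 1) == 4 &&
                  memcmp(buf, "data", 4) == 0);
    }

    if (rfd >= 0)
        close(rfd);
    close(wfd);
    unlink(path);
    return result;
}
```

The read succeeds because access was decided when open() returned the descriptor; read() and write() do not consult the caller's credentials again. That is why passing such a descriptor to a more privileged program is harmless for plain data, and why an interface whose write() re-checks privilege or triggers an action breaks the model.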
Andy Lutomirski wrote:
> fsconfigure(contextfd, ADD_BLOCKDEV, dfd, path, flags);
>
> fsconfigure(contextfd, ADD_OPTION, 0, "foo=bar", flags);
That seems okayish. I'm not sure we need the flags, but I do want to allow
for binary data in an option. So perhaps something like:
int fsconf
On Thu, Jul 12, 2018 at 1:34 PM Linus Torvalds
wrote:
>
> This is the whole "write() is only for data". If you ever have
> credentials mattering at write time, you're doing something wrong.
>
> Really really.
>
> Don't do it.
.. and I'd like to repeat: we *have* done things wrong. But that's
simp
On Thu, Jul 12, 2018 at 1:23 PM David Howells wrote:
>
> It's all very well to say "use file->f_creds". The problem is this has to be
> handed down all the way through the filesystem and down into the block layer
> as appropriate to anywhere there's an LSM call, a CAP_* check or a pathwalk -
> bu
> On Jul 12, 2018, at 1:23 PM, David Howells wrote:
>
> Linus Torvalds wrote:
>
>> Don't play games with override_creds. It's wrong.
>>
>> You have to use file->f_creds - no games, no garbage.
>
> You missed the point.
>
>
> My suggestion was to use override_creds() to impose the approp
Linus Torvalds wrote:
> Don't play games with override_creds. It's wrong.
>
> You have to use file->f_creds - no games, no garbage.
You missed the point.
It's all very well to say "use file->f_creds". The problem is this has to be
handed down all the way through the filesystem and down into t
On Thu, Jul 12, 2018 at 11:30:32AM -0700, Andy Lutomirski wrote:
>
> > On Jul 12, 2018, at 11:03 AM, Greg KH wrote:
> >
> >> On Thu, Jul 12, 2018 at 06:20:24PM +0100, Al Viro wrote:
> >>> On Thu, Jul 12, 2018 at 07:15:05PM +0200, Greg KH wrote:
> On Tue, Jul 10, 2018 at 11:44:09PM +0100, Da
On Thu, Jul 12, 2018 at 07:34:26PM +0100, Al Viro wrote:
> On Thu, Jul 12, 2018 at 11:30:32AM -0700, Andy Lutomirski wrote:
>
> Andi,
Apologies for misspelling - finger macros strike ;-/
On Thu, Jul 12, 2018 at 11:30:32AM -0700, Andy Lutomirski wrote:
Andi, Greg - alt.tasteless is over -> that way.
And for fsck sake, fix your MUA. Lines are obscenely long...
> How do you mount configfs in the first place? And how do you use this in a
> mount namespace without a private config
> On Jul 12, 2018, at 11:03 AM, Greg KH wrote:
>
>> On Thu, Jul 12, 2018 at 06:20:24PM +0100, Al Viro wrote:
>>> On Thu, Jul 12, 2018 at 07:15:05PM +0200, Greg KH wrote:
On Tue, Jul 10, 2018 at 11:44:09PM +0100, David Howells wrote:
Provide an fsopen() system call that starts the proc
On Thu, Jul 12, 2018 at 06:20:24PM +0100, Al Viro wrote:
> On Thu, Jul 12, 2018 at 07:15:05PM +0200, Greg KH wrote:
> > On Tue, Jul 10, 2018 at 11:44:09PM +0100, David Howells wrote:
> > > Provide an fsopen() system call that starts the process of preparing to
> > > create a superblock that will th
> On Jul 12, 2018, at 9:58 AM, Al Viro wrote:
>
>> On Thu, Jul 12, 2018 at 09:23:22AM -0700, Andy Lutomirski wrote:
>>
>> As a straw man, I suggest:
>>
>> fsconfigure(contextfd, ADD_BLOCKDEV, dfd, path, flags);
>>
>> fsconfigure(contextfd, ADD_OPTION, 0, "foo=bar", flags);
>
> Bollocks. F
On Thu, Jul 12, 2018 at 10:44 AM Al Viro wrote:
>
> Separating type name from everything else makes a lot of sense
I do not dispute that at all.
But you can specify the type name in the "commit" phase, it doesn't
have to be at "fsopen" time.
In fact, doing so would _force_ a certain cleanliness
On Thu, Jul 12, 2018 at 09:39:31AM -0700, Linus Torvalds wrote:
> > [1] one man's data is another man's commands, for starters. All networking
> > protocols would fit your description. So would ANSI escape sequences ("move
> > cursor to line 12 column 45" does sound like a command), so would wri
On Thu, Jul 12, 2018 at 10:14:05AM -0700, Linus Torvalds wrote:
> On Thu, Jul 12, 2018 at 9:39 AM Linus Torvalds
> wrote:
> >
> > I agree that a system call is likely saner. Especially since we'd have
> > one to _start_ this (ie "fsopen()") it would make sense to have the
> > one to finalize it.
>
On Thu, Jul 12, 2018 at 07:15:05PM +0200, Greg KH wrote:
> On Tue, Jul 10, 2018 at 11:44:09PM +0100, David Howells wrote:
> > Provide an fsopen() system call that starts the process of preparing to
> > create a superblock that will then be mountable, using an fd as a context
> > handle. fsopen() i
On Tue, Jul 10, 2018 at 11:44:09PM +0100, David Howells wrote:
> Provide an fsopen() system call that starts the process of preparing to
> create a superblock that will then be mountable, using an fd as a context
> handle. fsopen() is given the name of the filesystem that will be used:
>
>
On Thu, Jul 12, 2018 at 9:39 AM Linus Torvalds
wrote:
>
> I agree that a system call is likely saner. Especially since we'd have
> one to _start_ this (ie "fsopen()") it would make sense to have the
> one to finalize it.
Side note: if we can make do with just a buffer, then we wouldn't need
"fsop
On Thu, Jul 12, 2018 at 09:23:22AM -0700, Andy Lutomirski wrote:
> As a straw man, I suggest:
>
> fsconfigure(contextfd, ADD_BLOCKDEV, dfd, path, flags);
>
> fsconfigure(contextfd, ADD_OPTION, 0, "foo=bar", flags);
Bollocks. First of all, block device *IS* a fucking option.
Always had been.
On Thu, Jul 12, 2018 at 09:23:22AM -0700, Andy Lutomirski wrote:
> If you make a syscall that attaches a block device to an fscontext, you don’t
> need any of this. Heck, someone might actually *want* to grab a block device
> from a different namespace.
Fuck, NO. The whole notion of "block de
On Thu, Jul 12, 2018 at 9:31 AM Al Viro wrote:
>
> And seriously, ioctl? _That_ has a great track record...
I agree that a system call is likely saner. Especially since we'd have
one to _start_ this (ie "fsopen()") it would make sense to have the
one to finalize it.
> [1] one man's data is anoth
On Thu, Jul 12, 2018 at 9:23 AM Andy Lutomirski wrote:
>
> (Al- can’t we just stop allowing splice() at all on things that don’t use
> iov_iter?)
We could add a FMODE_SPLICE_READ/WRITE bit, and let people opt in to
splice. We probably should have.
But again, that really doesn't change the funda
On Thu, Jul 12, 2018 at 09:07:36AM -0700, Linus Torvalds wrote:
> On Thu, Jul 12, 2018 at 9:00 AM Al Viro wrote:
> >
> > Wait a sec - that's only a problem if your command contains pointer-chasing
> > et.al.
>
> No.
>
> It's a problem if anybody ever does something like "let's have a
> helper sp
> On Jul 12, 2018, at 7:54 AM, David Howells wrote:
>
> Andy Lutomirski wrote:
>
>>> On Jul 11, 2018, at 12:22 AM, David Howells wrote:
>>>
>>> Andy Lutomirski wrote:
>>>
> sfd = fsopen("ext4", FSOPEN_CLOEXEC);
> write(sfd, "s /dev/sdb1"); // note I'm ignoring write's length ar
On Thu, Jul 12, 2018 at 9:00 AM Al Viro wrote:
>
> Wait a sec - that's only a problem if your command contains pointer-chasing
> et.al.
No.
It's a problem if anybody ever does something like "let's have a
helper splice thread that uses splice to move data automatically from
one buffer to another
On Thu, Jul 12, 2018 at 08:50:46AM -0700, Linus Torvalds wrote:
> But "write()" simply is *NOT* a good "command" interface. If you want
> to send a command, use an ioctl or a system call.
>
> Because it's not just about credentials. It's not just about fooling a
> suid app into writing an error m
On Thu, Jul 12, 2018 at 7:54 AM David Howells wrote:
>
> I think we *have* to open the source files/devices with the creds of whoever
> called fsopen() or fspick() - that way you can't upgrade your privs by passing
> your context fd to a suid program. To enforce this, I think it's simplest for
>
Andy Lutomirski wrote:
> > On Jul 11, 2018, at 12:22 AM, David Howells wrote:
> >
> > Andy Lutomirski wrote:
> >
> >>> sfd = fsopen("ext4", FSOPEN_CLOEXEC);
> >>> write(sfd, "s /dev/sdb1"); // note I'm ignoring write's length arg
> >>
> >> Imagine some malicious program passes sfd as stdout
> On Jul 11, 2018, at 12:22 AM, David Howells wrote:
>
> Andy Lutomirski wrote:
>
>>> sfd = fsopen("ext4", FSOPEN_CLOEXEC);
>>> write(sfd, "s /dev/sdb1"); // note I'm ignoring write's length arg
>>
>> Imagine some malicious program passes sfd as stdout to a setuid
>> program. That program get
On Wed, Jul 11, 2018 at 08:22:41AM +0100, David Howells wrote:
> Andy Lutomirski wrote:
>
> > >sfd = fsopen("ext4", FSOPEN_CLOEXEC);
> > >write(sfd, "s /dev/sdb1"); // note I'm ignoring write's length arg
> >
> > Imagine some malicious program passes sfd as stdout to a setuid
> > program
Jonathan Corbet wrote:
> A minor detail but ... the "r" operation mentioned above is not actually
> implemented in this system call.
Yeah, that's something I'd like to add. NFS4 already does this inside its
->mount() method, so my thought is that we might be able to move this from
there to the
On Wed, Jul 11, 2018 at 1:42 AM David Howells wrote:
>
> Buffering till the end means you have to buffer *everything* - and,
> unless you limit your buffer, you risk running out of RAM
Do we really care?
Can't we limit the buffer size to something small?
Right now, the mount options c
On Tue, 10 Jul 2018 23:44:09 +0100
David Howells wrote:
> sfd = fsopen("ext4", FSOPEN_CLOEXEC);
> write(sfd, "s /dev/sdb1"); // note I'm ignoring write's length arg
> write(sfd, "o noatime");
> write(sfd, "o acl");
> write(sfd, "o user_attr");
> write(sfd, "o i
Andy Lutomirski wrote:
> > Umm... How about "use credentials of opener for everything"?
>
> If you want to audit every single filesystem for any code that uses
> credentials for anything and add all the right kernel APIs and make
> sure the filesystem uses them and somehow keep screwups from ge
Linus Torvalds wrote:
> Yeah, Andy is right that we should *not* make "write()" have side effects.
Note that write() has side effects all over the place: procfs, sysfs, debugfs,
tracefs, ... Though for the most part they're single-shot jobs and not
cumulative (I'm not sure this is always true f
Andy Lutomirski wrote:
> >sfd = fsopen("ext4", FSOPEN_CLOEXEC);
> >write(sfd, "s /dev/sdb1"); // note I'm ignoring write's length arg
>
> Imagine some malicious program passes sfd as stdout to a setuid
> program. That program gets persuaded to write "s /etc/shadow". What
> happens? You
On Tue, Jul 10, 2018 at 6:15 PM Al Viro wrote:
>
> Umm... How about "use credentials of opener for everything"?
yeah, we have that for writes in general.
Nobody ever actually follows that rule. They may *think* they do, and
then they call to some helper that does "capability(CAP_SYS_WHATEVAH)"
On Tue, Jul 10, 2018 at 6:15 PM, Al Viro wrote:
> On Tue, Jul 10, 2018 at 06:05:49PM -0700, Linus Torvalds wrote:
>> Yeah, Andy is right that we should *not* make "write()" have side effects.
>>
>> Use it to queue things by all means, but not "do" things. Not unless
>> there's a very sane security
On Tue, Jul 10, 2018 at 06:14:10PM -0700, Jann Horn wrote:
> I also love ioctls, so I think you could also use an ioctl to do the
> commit? You can do anything (well, almost anything) that you can do in
> syscall context in ioctl context, too; and when you already have a
> file descriptor of a spe
On Tue, Jul 10, 2018 at 06:05:49PM -0700, Linus Torvalds wrote:
> Yeah, Andy is right that we should *not* make "write()" have side effects.
>
> Use it to queue things by all means, but not "do" things. Not unless
> there's a very sane security model.
>
> On Tue, Jul 10, 2018 at 4:59 PM Andy Luto
On Tue, Jul 10, 2018 at 4:59 PM Andy Lutomirski wrote:
>
> [cc Jann - you love this stuff]
>
> > On Jul 10, 2018, at 3:44 PM, David Howells wrote:
> >
> > Provide an fsopen() system call that starts the process of preparing to
> > create a superblock that will then be mountable, using an fd as a
Yeah, Andy is right that we should *not* make "write()" have side effects.
Use it to queue things by all means, but not "do" things. Not unless
there's a very sane security model.
On Tue, Jul 10, 2018 at 4:59 PM Andy Lutomirski wrote:
>
> I think the right solution is one of:
>
> (a) Pass a netl
[cc Jann - you love this stuff]
> On Jul 10, 2018, at 3:44 PM, David Howells wrote:
>
> Provide an fsopen() system call that starts the process of preparing to
> create a superblock that will then be mountable, using an fd as a context
> handle. fsopen() is given the name of the filesystem that
Provide an fsopen() system call that starts the process of preparing to
create a superblock that will then be mountable, using an fd as a context
handle. fsopen() is given the name of the filesystem that will be used:
int mfd = fsopen(const char *fsname, unsigned int flags);
where flags