Frederick Lawler <f...@cloudflare.com> writes:

> On 6/13/22 11:44 PM, Eric W. Biederman wrote:
>> Frederick Lawler <f...@cloudflare.com> writes:
>> 
>>> Hi Eric,
>>>
>>> On 6/13/22 12:04 PM, Eric W. Biederman wrote:
>>>> Frederick Lawler <f...@cloudflare.com> writes:
>>>>
>>>>> While experimenting with the security_prepare_creds() LSM hook, we
>>>>> noticed that our EPERM error code was not propagated up the callstack.
>>>>> Instead ENOMEM is always returned.  As a result, some tools may send a
>>>>> confusing error message to the user:
>>>>>
>>>>> $ unshare -rU
>>>>> unshare: unshare failed: Cannot allocate memory
>>>>>
>>>>> A user would think that the system didn't have enough memory, when
>>>>> instead the action was denied.
>>>>>
>>>>> This problem occurs because prepare_creds() and prepare_kernel_cred()
>>>>> return NULL when security_prepare_creds() returns an error code. Later,
>>>>> functions calling prepare_creds() and prepare_kernel_cred() return
>>>>> ENOMEM because they assume that a NULL meant there was no memory
>>>>> allocated.
>>>>>
>>>>> Fix this by propagating an error code from security_prepare_creds() up
>>>>> the callstack.
>>>> Why would it make sense for security_prepare_creds to return an error
>>>> code other than ENOMEM?
>>>>   > That seems a bit of a violation of what that function is supposed to do
>>>>
>>>
>>> The API allows LSM authors to decide what error code is returned from the
>>> cred_prepare hook. security_task_alloc() is a similar hook, and has its 
>>> return
>>> code propagated.
>> It is not an api.  It is an implementation detail of the linux kernel.
>> It is a set of convenient functions that do a job.
>> The general rule is we don't support cases without an in-tree user.  I
>> don't see an in-tree user.
>> 
>>> I'm proposing we follow security_task_allocs() pattern, and add visibility 
>>> for
>>> failure cases in prepare_creds().
>> I am asking why we would want to.  Especially as it is not an API, and I
>> don't see any good reason for anything but an -ENOMEM failure to be
>> supported.
>>
> We're writing a LSM BPF policy, and not a new LSM. Our policy aims to solve
> unprivileged unshare, similar to Debian's patch [1]. We're in a position such
> that we can't use that patch because we can't block _all_ of our applications
> from performing an unshare. We prefer a granular approach. LSM BPF seems like 
> a
> good choice.

I am quite puzzled why doesn't /proc/sys/user/max_user_namespaces work
for you?

> Because LSM BPF exposes these hooks, we should probably treat them as an
> API. From that perspective, userspace expects unshare to return a EPERM 
> when the call is denied permissions.

The BPF code gets to be treated as a out of tree kernel module.

>> Without an in-tree user that cares it is probably better to go the
>> opposite direction and remove the possibility of return anything but
>> memory allocation failure.  That will make it clearer to implementors
>> that a general error code is not supported and this is not a location
>> to implement policy, this is only a hook to allocate state for the LSM.
>> 
>
> That's a good point, and it's possible we're using the wrong hook for the
> policy. Do you know of other hooks we can look into?

Not off the top of my head.

>>>> I have probably missed a very interesting discussion where that was
>>>> mentioned but I don't see link to the discussion or anything explaining
>>>> why we want to do that in this change.
>>>>
>>>
>>> AFAIK, this is the start of the discussion.
>> You were on v3 and had an out of tree piece of code so I assumed someone
>> had at least thought about why you want to implement policy in a piece
>> of code whose only purpose is to allocate memory to store state.
>> 
>
> No worries.
>
>> Eric
>> 
>> 
>
> Links:
> 1:
> https://sources.debian.org/patches/linux/3.16.56-1+deb8u1/debian/add-sysctl-to-disallow-unprivileged-CLONE_NEWUSER-by-default.patch/

Eric

--
Linux-cachefs mailing list
Linux-cachefs@redhat.com
https://listman.redhat.com/mailman/listinfo/linux-cachefs

Reply via email to