On Sun, Dec 23, 2018 at 12:29:44PM +0100, Greg KH wrote:
> On Sat, Dec 22, 2018 at 10:18:06PM +0100, Christian Brauner wrote:
> > Since binderfs can be mounted by userns root in non-initial user namespaces
> > some precautions are in order. First, a way to set a maximum on the number
> > of binder devices that can be allocated per binderfs instance and second, a
> > way to reserve a reasonable chunk of binderfs devices for the initial ipc
> > namespace.
> >
> > A first approach as seen in [1] used sysctls similar to devpts but was
> > shown to be flawed (cf. [2] and [3]) since some aspects were unneeded. This
> > is an alternative approach which avoids sysctls completely and instead
> > switches to a single mount option.
> >
> > Starting with this commit binderfs instances can be mounted with a limit on
> > the number of binder devices that can be allocated. The max=<count> mount
> > option serves as a per-instance limit. If max=<count> is set then only
> > <count> binder devices can be allocated in this binderfs instance.
>
> Ok, this is fine, but why such a big default? You only need 4 to run a
> modern android system, and anyone using binder outside of android is
> really too crazy to ever be using it in a container :)
>
> > Additionally, the binderfs instance in the initial ipc namespace will
> > always have a reserve of at least 1024 binder devices unless explicitly
> > capped via max=<count>.
>
> Again, why so many? And why wouldn't that initial ipc namespace already
> have its device nodes created _before_ anything else is mounted?

Right, my issue is with re-creating devices, e.g. when a binderfs instance
gets unmounted or when device nodes get removed via rm. But we can lower
the number to 4 (see below).
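Just to make the intended usage concrete, a capped instance would be
mounted from userspace roughly like this (illustrative only; the
mountpoint is arbitrary and this assumes the filesystem type keeps the
name "binder"):

#include <stdio.h>
#include <sys/mount.h>

int main(void)
{
	/* Mount a binderfs instance limited to 4 binder devices. */
	if (mount("binder", "/dev/binderfs", "binder", 0, "max=4") < 0) {
		perror("mount");
		return 1;
	}

	return 0;
}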
> Some comments on the patch below:

Thanks!

> > +/*
> > + * Ensure that the initial ipc namespace always has a good chunk of devices
> > + * available.
> > + */
> > +#define BINDERFS_MAX_MINOR_CAPPED (BINDERFS_MAX_MINOR - 1024)
>
> Again that seems crazy big, how about splitting this into two different
> patches, one for the max= stuff, and one for this "reserve some minors"
> thing, so we can review them separately.

Yes, let's do that. I will also lower this to 4 reserved devices.

> >  static struct vfsmount *binderfs_mnt;
> >
> > @@ -46,6 +52,24 @@ static dev_t binderfs_dev;
> >  static DEFINE_MUTEX(binderfs_minors_mutex);
> >  static DEFINE_IDA(binderfs_minors);
> >
> > +/**
> > + * binderfs_mount_opts - mount options for binderfs
> > + * @max: maximum number of allocatable binderfs binder devices
> > + */
> > +struct binderfs_mount_opts {
> > +	int max;
> > +};
> > +
> > +enum {
> > +	Opt_max,
> > +	Opt_err
> > +};
> > +
> > +static const match_table_t tokens = {
> > +	{ Opt_max, "max=%d" },
> > +	{ Opt_err, NULL }
> > +};
> > +
> >  /**
> >   * binderfs_info - information about a binderfs mount
> >   * @ipc_ns: The ipc namespace the binderfs mount belongs to.
> > @@ -55,13 +79,16 @@ static DEFINE_IDA(binderfs_minors);
> >   *            created.
> >   * @root_gid: gid that needs to be used when a new binder device is
> >   *            created.
> > + * @mount_opts: The mount options in use.
> > + * @device_count: The current number of allocated binder devices.
> >   */
> >  struct binderfs_info {
> >  	struct ipc_namespace *ipc_ns;
> >  	struct dentry *control_dentry;
> >  	kuid_t root_uid;
> >  	kgid_t root_gid;
> > -
> > +	struct binderfs_mount_opts mount_opts;
> > +	atomic_t device_count;
>
> Why atomic?
>
> You should already have the lock held every time this is accessed,
> so no need to use an atomic value, just use an int.

> >  	/* Reserve new minor number for the new device. */
> >  	mutex_lock(&binderfs_minors_mutex);
> > -	minor = ida_alloc_max(&binderfs_minors, BINDERFS_MAX_MINOR, GFP_KERNEL);
> > +	if (atomic_inc_return(&info->device_count) < info->mount_opts.max)
>
> No need for atomic, see, your lock is held :)

Habit, to be honest.

Thanks, fixed version to follow in a bit.
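It'll look roughly like this (untested sketch; plain int under
binderfs_minors_mutex, the count dropped back down when the ida
allocation fails, and <= rather than < so that exactly max devices can
be allocated; the rest of the error unwinding is elided):

	/* Reserve new minor number for the new device. */
	mutex_lock(&binderfs_minors_mutex);
	if (++info->device_count <= info->mount_opts.max)
		minor = ida_alloc_max(&binderfs_minors, BINDERFS_MAX_MINOR,
				      GFP_KERNEL);
	else
		minor = -ENOSPC;
	if (minor < 0) {
		/* Undo the increment so the cap stays accurate. */
		--info->device_count;
		mutex_unlock(&binderfs_minors_mutex);
		return minor;
	}
	mutex_unlock(&binderfs_minors_mutex);

Christian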