Hi Tejun,

Sorry for the late reply.

On Fri, Aug 30, 2019 at 09:58:15PM -0700, Tejun Heo wrote:
> Hello,
>
> On Sat, Aug 31, 2019 at 12:03:26PM +0900, Namhyung Kim wrote:
> > Hmm.. it looks hard to use fhandle as the identifier since perf
> > sampling is done in NMI context.  AFAICS the encode_fh part seems ok
> > but getting dentry/inode from a kernfs_node seems not.
> >
> > I assume kernfs_node_id's ino and gen are same to its inode's.  Then
> > we might use kernfs_node for encoding but not sure you like it ;-)
>
> Oh yeah, the whole cgroup id situation is kinda shitty and it's likely
> that it needs to be cleaned up a bit for this to be used widely.  The
> issues are...
>
> * As identifiers, paths sucks.  It's too big and unwieldy and can be
>   rapidly reused for different instances.
>
> * ino is compact but can't be easily mapped to path from userland and
>   also not unique.
>
> * The fhandle identifier - currently ino+gen - is better in that it's
>   finite sized and compact and can be efficiently mapped to path from
>   userspace.  It's also mostly unique.  However, the way gen is
>   currently generated still has some chance of the same ID getting
>   reused and it isn't easily accessible from inside the kernel right
>   now.
>
> Eventually, where we wanna be at is having a single 64bit identifier
> which can be easily used everywhere.  It should be pretty straight
> forward on 64bit machines - we can just use monotonically increasing
> id and use it for everything - ino, fhandle and internal cgroup id.
> On 32bit, it gets a bit complicated because ino is 32bit, so it'll
> need a custom allocator which bumps gen when the lower 32bit wraps and
> skips in-use inos.  Once we have that, we can use that for cgrp->id
> and fhandle and derive ino from it.
>
> This is on the to-do list but obviously hasn't happened yet.  If you
> wanna take on it, great, but, otherwise, what can be done now is
> either moving gen+ino generation into cgroup and tell kernfs to use it
> or copy gen+ino into cgroup for easier access.  The former likely is
> the better approach given that it brings us closer to where we wanna
> be eventually.

So is my understanding below correct?

 * currently kernfs ino+gen is different from the inode's ino+gen
 * but it'd be better to make them the same
 * so move the (generic?) inode ino+gen logic into cgroup
 * and have kernfs nodes use the same logic (and numbers)
 * so the perf sampling code (NMI) can just access the kernfs node
 * and userspace can use the file handle for comparison

Thanks,
Namhyung