On 08/08, Eric W. Biederman wrote:
>
> Oleg Nesterov <o...@redhat.com> writes:
>
> > To me it looks a bit annoying that task_mmu.c needs 6 seq_operations's and
> > 6 file_operations's to handle /proc/pid/*maps*. And _only_ because ->show()
> > differs.
> >
> > Eric, et al, what do you think? At least something like 1-3 looks like a
> > good cleanup imho. And afaics we can do more cleanups on top.
>
>
> I see where you are getting annoyed.
>
> Taking a quick look at task_mmu.c  It looks like the tgid vs pid logic
> to decided which stack or stacks to display is simply incorrect.

Yes, probably, but please forget this for now. Because,

> Given where you are starting I think tack_mmu.c code that decides
> when/which stack deserves a serious audit.

Yes. And more. At least this code needs more cleanups. ->task should
die or at least we should avoid get/put_task_struct, ->pid can die too,
hold_task_mempolicy() doesn't look correct (at least the "prevent changing
our mempolicy while show_numa_maps" comment and down_write() in
do_set_mempolicy(). I am going to try to cleanup this a bit after I change
task_nommu.c to avoid mm_access() in m_start().

But this is off-topic,

> At a practical level moving pid_entry into the proc inode is ugly
> especially for the hack that is is_tgid_pid_entry.

Well, I am not going to argue with maintainer, mostly simply because
I do not understand this code enough ;)

But let me say that I disagree. We already have ->fd, and ->sysctl*.
I see nothing wrong if *id_base_stuff files have more info for free.

And imo proc_inode->sysctl* is much worse. Simply because they are
not private to fs/proc/proc_sysctl.c.

Could you please look into the attached mbox? I am just curious if we
can do something like this in a clean way. In particular, please look at
"Note:" comment in 3/3. Perhaps proc_sys_init() can do proc_get_inode(),
initialize/instantiate it...

To clarify, of course it is not that I want to shrink sizeof(proc_inode).
Just to me it doesn't look clean that proc_evict_inode() has to do
sysctl_head_put(), grab_header() assumes that ->sysctl == NULL means
sysctl_table_root.*, etc.

> That test could be implemented more easily by looking at the parent
> directories inode operations and seeing if they are
> proc_root_inode_operations.

Hmm. Looking at inode? How?

Oleg.
>From a4c94461ae18b2d6cc2e8a9cfb042d0b6cc46e86 Mon Sep 17 00:00:00 2001
From: Oleg Nesterov <o...@redhat.com>
Date: Sat, 9 Aug 2014 17:25:53 +0200
Subject: [PATCH 1/3] proc: introduce proc_sys_evict_inode()

Preparation. Shift the sysctl_head_put() code from proc_evict_inode()
into the new trivial helper, make sysctl_head_put() static.

---
 fs/proc/inode.c       |    8 ++------
 fs/proc/internal.h    |    4 ++--
 fs/proc/proc_sysctl.c |   13 ++++++++++++-
 3 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/fs/proc/inode.c b/fs/proc/inode.c
index 0adbc02..9f6715a 100644
--- a/fs/proc/inode.c
+++ b/fs/proc/inode.c
@@ -31,7 +31,6 @@
 static void proc_evict_inode(struct inode *inode)
 {
        struct proc_dir_entry *de;
-       struct ctl_table_header *head;
        const struct proc_ns_operations *ns_ops;
        void *ns;
 
@@ -45,11 +44,8 @@ static void proc_evict_inode(struct inode *inode)
        de = PROC_I(inode)->pde;
        if (de)
                pde_put(de);
-       head = PROC_I(inode)->sysctl;
-       if (head) {
-               RCU_INIT_POINTER(PROC_I(inode)->sysctl, NULL);
-               sysctl_head_put(head);
-       }
+
+       proc_sys_evict_inode(inode);
        /* Release any associated namespace */
        ns_ops = PROC_I(inode)->ns.ns_ops;
        ns = PROC_I(inode)->ns.ns;
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 78784cd..7ce3262 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -238,10 +238,10 @@ extern int proc_setup_self(struct super_block *);
  */
 #ifdef CONFIG_PROC_SYSCTL
 extern int proc_sys_init(void);
-extern void sysctl_head_put(struct ctl_table_header *);
+extern void proc_sys_evict_inode(struct inode *inode);
 #else
 static inline void proc_sys_init(void) { }
-static inline void sysctl_head_put(struct ctl_table_header *head) { }
+static inline void proc_sys_evict_inode(struct inode *inode) { }
 #endif
 
 /*
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 7129046..2fa67e7 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -256,7 +256,7 @@ static void sysctl_head_get(struct ctl_table_header *head)
        spin_unlock(&sysctl_lock);
 }
 
-void sysctl_head_put(struct ctl_table_header *head)
+static void sysctl_head_put(struct ctl_table_header *head)
 {
        spin_lock(&sysctl_lock);
        if (!--head->count)
@@ -264,6 +264,17 @@ void sysctl_head_put(struct ctl_table_header *head)
        spin_unlock(&sysctl_lock);
 }
 
+void proc_sys_evict_inode(struct inode *inode)
+{
+       struct ctl_table_header *head;
+
+       head = PROC_I(inode)->sysctl;
+       if (head) {
+               RCU_INIT_POINTER(PROC_I(inode)->sysctl, NULL);
+               sysctl_head_put(head);
+       }
+}
+
 static struct ctl_table_header *sysctl_head_grab(struct ctl_table_header *head)
 {
        BUG_ON(!head);
-- 
1.5.5.1


>From e0b42f4770cd76069e4e70e48e4e7bfa84fab4bf Mon Sep 17 00:00:00 2001
From: Oleg Nesterov <o...@redhat.com>
Date: Sat, 9 Aug 2014 17:37:31 +0200
Subject: [PATCH 2/3] proc: change proc_sys_evict_inode() to check inode->i_op

Change proc_sys_evict_inode() to verify that this inode is really
a /proc/sys inode before using ->sysctl.

---
 fs/proc/proc_sysctl.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 2fa67e7..863aaee 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -268,6 +268,10 @@ void proc_sys_evict_inode(struct inode *inode)
 {
        struct ctl_table_header *head;
 
+       if (inode->i_op != &proc_sys_inode_operations &&
+           inode->i_op != &proc_sys_dir_operations)
+               return;
+
        head = PROC_I(inode)->sysctl;
        if (head) {
                RCU_INIT_POINTER(PROC_I(inode)->sysctl, NULL);
-- 
1.5.5.1


>From 9c8e98fc45f091d5d2a4bfc6dcd86b67275160a1 Mon Sep 17 00:00:00 2001
From: Oleg Nesterov <o...@redhat.com>
Date: Sun, 10 Aug 2014 17:37:25 +0200
Subject: [PATCH 3/3] proc: place proc_inode->sysctl* and proc_inode->fd into a 
union

->sysctl* and ->fd members can share the memory, add a union.

Note: unfortunately proc_alloc_inode() still has to initialize ->sysctl*
members. And the only (afaics) reason is that proc_mkdir("sys") relies
on ->sysctl == NULL which means sysctl_table_root in grab_header(). It
would be nice to initialize these members after proc_get_inode("sys")
somehow and remove the special "NULL means global root" case in proc_sys
paths somehow, but I do not see how.

---
 fs/proc/internal.h |   12 ++++++++----
 1 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 7ce3262..8d35ac0 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -60,11 +60,15 @@ union proc_op {
 
 struct proc_inode {
        struct pid *pid;
-       int fd;
-       union proc_op op;
        struct proc_dir_entry *pde;
-       struct ctl_table_header *sysctl;
-       struct ctl_table *sysctl_entry;
+       union proc_op op;
+       union {
+               struct {
+                       struct ctl_table_header *sysctl;
+                       struct ctl_table *sysctl_entry;
+               };
+               int fd;
+       };
        struct proc_ns ns;
        struct inode vfs_inode;
 };
-- 
1.5.5.1

Reply via email to