Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc

2014-02-04 Thread Oleg Drokin
Hello!

On Feb 4, 2014, at 11:57 AM, Greg Kroah-Hartman wrote:

> What exactly are you doing here?  Calling out to userspace for what
> information?  And how are you going to handle namespaces and containers
> by doing that?  Are you going to block in the kernel for this
> information?
> 
> What are you trying to solve with this code in the first place?

So, as an overview of a feature:

When you have tens of thousand (and even hundreds) of nodes doing IO,
it's no longer practical to tell them apart separately or in some
network-related groups based on their address server-side
(for purposes like monitoring and load control).
Since such systems are usually managed by some sort of a job/batch scheduler,
it's much more natural to organize them into "jobs" as known to the
job scheduler instead. This job id information is harvested and added to
all RPCs so that server side can do useful things with this information
(like aggregate statistics, identify pathologically bad workcases,
QoS and so on).

Most of batch schedulers out there let userspace know their own JOBID as an
environment variable. So original implementation was just harvesting this
info directly from process environment. I certainly can see why this is not
really desirable.

So, the patch does away with the environment parsing, instead it adds two
new venues of getting this information:

1. In vast majorities of cases entire node is dedicated to a single job, so
we just create a /proc variable where you can input job id from job scheduler
prologue (and then clear it from an epilogue). Getting jobid in this case does
does not dip in userspace anymore. This also does not block anywhere.

2. In some more rare systems with lots of cores they actually seem to be 
subdividing
individual nodes across jobs. Additionally all systems usually have 
login/interactive
nodes. While these sort of interactive nodes do not have jobs scheduled on 
them, it
still might be useful to distinguish between different user sessions happening 
there.
This is where the upcalls come into play. First time a process does IO an upcall
would be called that would provide the kernel with jobid identifier (however it 
might
want to obtain it, we don't really care at this point). This would block (with 
a timeout)
for the reply. The reply is then cached and reused for subsequent io from the 
same process.
I did not really think about containers before, but I think it would work out 
anyway:
I think namespaces and containers still have non-intersecting pid space so we 
should be
fine in this regard. There is a "master container"/namespace of some sort I 
think, that is able to see
entire pid space, and that's where the upcall would be run (do I need to 
somehow force that?)

Bye,
Oleg

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc

2014-02-04 Thread Greg Kroah-Hartman
On Tue, Feb 04, 2014 at 10:12:10AM +0400, Oleg Drokin wrote:
> Hello!
> 
> On Wed, Oct 30, 2013 at 06:21:01AM -0700, Greg Kroah-Hartman wrote:
> > > - * stored in between the "env_start" & "env_end" of task struct.
> > > +static char *self_environ_file = "/proc/self/environ";
> > 
> > Heh, no, that's not ok at all.
> > 
> > This is a _huge_ sign that you are doing something wrong in your driver
> > if you need something that isn't exported, or that you have to dig out
> > of proc.
> > 
> > Sorry, I can't take this, please fix the underlying problems that would
> > even think that you need access to the environment from within a kernel
> > driver.
> 
> I took a stab at this.
> This is not a final patch, I know there's still some number of checkpatch
> warnings and the proc layout is not finalized yet for example.
> 
> But before I spend any more time in polishing this, can you please take a look
> and advise if this direction would be acceptable for you when driven to
> completion?

What exactly are you doing here?  Calling out to userspace for what
information?  And how are you going to handle namespaces and containers
by doing that?  Are you going to block in the kernel for this
information?

What are you trying to solve with this code in the first place?

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc

2014-02-04 Thread Greg Kroah-Hartman
On Tue, Feb 04, 2014 at 10:12:10AM +0400, Oleg Drokin wrote:
 Hello!
 
 On Wed, Oct 30, 2013 at 06:21:01AM -0700, Greg Kroah-Hartman wrote:
   - * stored in between the env_start  env_end of task struct.
   +static char *self_environ_file = /proc/self/environ;
  
  Heh, no, that's not ok at all.
  
  This is a _huge_ sign that you are doing something wrong in your driver
  if you need something that isn't exported, or that you have to dig out
  of proc.
  
  Sorry, I can't take this, please fix the underlying problems that would
  even think that you need access to the environment from within a kernel
  driver.
 
 I took a stab at this.
 This is not a final patch, I know there's still some number of checkpatch
 warnings and the proc layout is not finalized yet for example.
 
 But before I spend any more time in polishing this, can you please take a look
 and advise if this direction would be acceptable for you when driven to
 completion?

What exactly are you doing here?  Calling out to userspace for what
information?  And how are you going to handle namespaces and containers
by doing that?  Are you going to block in the kernel for this
information?

What are you trying to solve with this code in the first place?

greg k-h
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc

2014-02-04 Thread Oleg Drokin
Hello!

On Feb 4, 2014, at 11:57 AM, Greg Kroah-Hartman wrote:

 What exactly are you doing here?  Calling out to userspace for what
 information?  And how are you going to handle namespaces and containers
 by doing that?  Are you going to block in the kernel for this
 information?
 
 What are you trying to solve with this code in the first place?

So, as an overview of a feature:

When you have tens of thousand (and even hundreds) of nodes doing IO,
it's no longer practical to tell them apart separately or in some
network-related groups based on their address server-side
(for purposes like monitoring and load control).
Since such systems are usually managed by some sort of a job/batch scheduler,
it's much more natural to organize them into jobs as known to the
job scheduler instead. This job id information is harvested and added to
all RPCs so that server side can do useful things with this information
(like aggregate statistics, identify pathologically bad workcases,
QoS and so on).

Most of batch schedulers out there let userspace know their own JOBID as an
environment variable. So original implementation was just harvesting this
info directly from process environment. I certainly can see why this is not
really desirable.

So, the patch does away with the environment parsing, instead it adds two
new venues of getting this information:

1. In vast majorities of cases entire node is dedicated to a single job, so
we just create a /proc variable where you can input job id from job scheduler
prologue (and then clear it from an epilogue). Getting jobid in this case does
does not dip in userspace anymore. This also does not block anywhere.

2. In some more rare systems with lots of cores they actually seem to be 
subdividing
individual nodes across jobs. Additionally all systems usually have 
login/interactive
nodes. While these sort of interactive nodes do not have jobs scheduled on 
them, it
still might be useful to distinguish between different user sessions happening 
there.
This is where the upcalls come into play. First time a process does IO an upcall
would be called that would provide the kernel with jobid identifier (however it 
might
want to obtain it, we don't really care at this point). This would block (with 
a timeout)
for the reply. The reply is then cached and reused for subsequent io from the 
same process.
I did not really think about containers before, but I think it would work out 
anyway:
I think namespaces and containers still have non-intersecting pid space so we 
should be
fine in this regard. There is a master container/namespace of some sort I 
think, that is able to see
entire pid space, and that's where the upcall would be run (do I need to 
somehow force that?)

Bye,
Oleg

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc

2014-02-03 Thread Oleg Drokin
Hello!

On Wed, Oct 30, 2013 at 06:21:01AM -0700, Greg Kroah-Hartman wrote:
> > - * stored in between the "env_start" & "env_end" of task struct.
> > +static char *self_environ_file = "/proc/self/environ";
> 
> Heh, no, that's not ok at all.
> 
> This is a _huge_ sign that you are doing something wrong in your driver
> if you need something that isn't exported, or that you have to dig out
> of proc.
> 
> Sorry, I can't take this, please fix the underlying problems that would
> even think that you need access to the environment from within a kernel
> driver.

I took a stab at this.
This is not a final patch, I know there's still some number of checkpatch
warnings and the proc layout is not finalized yet for example.

But before I spend any more time in polishing this, can you please take a look
and advise if this direction would be acceptable for you when driven to
completion?

Thanks.


>From 6a5b58657cc32163738d4a8c210e8683159b582f Mon Sep 17 00:00:00 2001
From: Oleg Drokin 
Date: Tue, 4 Feb 2014 00:32:12 -0500
Subject: [PATCH] staging/lustre: Obtain jobid invormation via upcall

Replace lustre jobid information fetching directly from
process env variable with either node-wide jobid obtained via
a proc file, or through an upcall that would provide the jobid
if more fine-grained operations are necessary.

Signed-off-by: Oleg Drokin 
---
 .../staging/lustre/include/linux/libcfs/curproc.h  |   1 -
 .../staging/lustre/include/linux/libcfs/lucache.h  |   6 +
 .../staging/lustre/lustre/include/lprocfs_status.h |   1 +
 .../lustre/lustre/libcfs/linux/linux-curproc.c | 152 -
 drivers/staging/lustre/lustre/obdclass/Makefile|   2 +-
 drivers/staging/lustre/lustre/obdclass/class_obd.c |  67 
 .../staging/lustre/lustre/obdclass/jobid_cache.c   | 181 +
 .../lustre/lustre/obdclass/jobid_internal.h|  10 ++
 .../lustre/lustre/obdclass/linux/linux-module.c| 114 +
 9 files changed, 345 insertions(+), 189 deletions(-)
 create mode 100644 drivers/staging/lustre/lustre/obdclass/jobid_cache.c
 create mode 100644 drivers/staging/lustre/lustre/obdclass/jobid_internal.h

diff --git a/drivers/staging/lustre/include/linux/libcfs/curproc.h 
b/drivers/staging/lustre/include/linux/libcfs/curproc.h
index 507d16b..cf1f26b 100644
--- a/drivers/staging/lustre/include/linux/libcfs/curproc.h
+++ b/drivers/staging/lustre/include/linux/libcfs/curproc.h
@@ -63,7 +63,6 @@ intcfs_curproc_groups_nr(void);
 /* check if task is running in compat mode.*/
 #define current_pid()  (current->pid)
 #define current_comm() (current->comm)
-int cfs_get_environ(const char *key, char *value, int *val_len);
 
 typedef __u32 cfs_cap_t;
 
diff --git a/drivers/staging/lustre/include/linux/libcfs/lucache.h 
b/drivers/staging/lustre/include/linux/libcfs/lucache.h
index 9668b39..f8361d7 100644
--- a/drivers/staging/lustre/include/linux/libcfs/lucache.h
+++ b/drivers/staging/lustre/include/linux/libcfs/lucache.h
@@ -82,6 +82,11 @@ struct md_identity {
struct md_perm  *mi_perms;
 };
 
+struct jobid_cache_entry {
+   struct upcall_cache_entry *jce_uc_entry;
+   char  *jce_jobid;
+};
+
 struct upcall_cache_entry {
struct list_head  ue_hash;
__u64  ue_key;
@@ -92,6 +97,7 @@ struct upcall_cache_entry {
cfs_time_tue_expire;
union {
struct md_identity identity;
+   struct jobid_cache_entry jobid;
} u;
 };
 
diff --git a/drivers/staging/lustre/lustre/include/lprocfs_status.h 
b/drivers/staging/lustre/lustre/include/lprocfs_status.h
index 428e3e4..3c99dcf 100644
--- a/drivers/staging/lustre/lustre/include/lprocfs_status.h
+++ b/drivers/staging/lustre/lustre/include/lprocfs_status.h
@@ -369,6 +369,7 @@ static inline void s2dhms(struct dhms *ts, time_t secs)
 #define JOBSTATS_JOBID_VAR_MAX_LEN 20
 #define JOBSTATS_DISABLE   "disable"
 #define JOBSTATS_PROCNAME_UID  "procname_uid"
+#define JOBSTATS_NODELOCAL "nodelocal"
 
 extern int lprocfs_write_frac_helper(const char *buffer, unsigned long count,
 int *val, int mult);
diff --git a/drivers/staging/lustre/lustre/libcfs/linux/linux-curproc.c 
b/drivers/staging/lustre/lustre/libcfs/linux/linux-curproc.c
index a2ef64c..7c48601 100644
--- a/drivers/staging/lustre/lustre/libcfs/linux/linux-curproc.c
+++ b/drivers/staging/lustre/lustre/libcfs/linux/linux-curproc.c
@@ -140,158 +140,6 @@ int cfs_capable(cfs_cap_t cap)
return capable(cfs_cap_unpack(cap));
 }
 
-static int cfs_access_process_vm(struct task_struct *tsk, unsigned long addr,
-void *buf, int len, int write)
-{
-   /* Just copied from kernel for the kernels which doesn't
-* have access_process_vm() exported */
-   struct mm_struct *mm;
-   struct vm_area_struct *vma;
-   struct page *page;
-   void 

Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc

2014-02-03 Thread Oleg Drokin
Hello!

On Wed, Oct 30, 2013 at 06:21:01AM -0700, Greg Kroah-Hartman wrote:
  - * stored in between the env_start  env_end of task struct.
  +static char *self_environ_file = /proc/self/environ;
 
 Heh, no, that's not ok at all.
 
 This is a _huge_ sign that you are doing something wrong in your driver
 if you need something that isn't exported, or that you have to dig out
 of proc.
 
 Sorry, I can't take this, please fix the underlying problems that would
 even think that you need access to the environment from within a kernel
 driver.

I took a stab at this.
This is not a final patch, I know there's still some number of checkpatch
warnings and the proc layout is not finalized yet for example.

But before I spend any more time in polishing this, can you please take a look
and advise if this direction would be acceptable for you when driven to
completion?

Thanks.


From 6a5b58657cc32163738d4a8c210e8683159b582f Mon Sep 17 00:00:00 2001
From: Oleg Drokin gr...@linuxhacker.ru
Date: Tue, 4 Feb 2014 00:32:12 -0500
Subject: [PATCH] staging/lustre: Obtain jobid invormation via upcall

Replace lustre jobid information fetching directly from
process env variable with either node-wide jobid obtained via
a proc file, or through an upcall that would provide the jobid
if more fine-grained operations are necessary.

Signed-off-by: Oleg Drokin oleg.dro...@intel.com
---
 .../staging/lustre/include/linux/libcfs/curproc.h  |   1 -
 .../staging/lustre/include/linux/libcfs/lucache.h  |   6 +
 .../staging/lustre/lustre/include/lprocfs_status.h |   1 +
 .../lustre/lustre/libcfs/linux/linux-curproc.c | 152 -
 drivers/staging/lustre/lustre/obdclass/Makefile|   2 +-
 drivers/staging/lustre/lustre/obdclass/class_obd.c |  67 
 .../staging/lustre/lustre/obdclass/jobid_cache.c   | 181 +
 .../lustre/lustre/obdclass/jobid_internal.h|  10 ++
 .../lustre/lustre/obdclass/linux/linux-module.c| 114 +
 9 files changed, 345 insertions(+), 189 deletions(-)
 create mode 100644 drivers/staging/lustre/lustre/obdclass/jobid_cache.c
 create mode 100644 drivers/staging/lustre/lustre/obdclass/jobid_internal.h

diff --git a/drivers/staging/lustre/include/linux/libcfs/curproc.h 
b/drivers/staging/lustre/include/linux/libcfs/curproc.h
index 507d16b..cf1f26b 100644
--- a/drivers/staging/lustre/include/linux/libcfs/curproc.h
+++ b/drivers/staging/lustre/include/linux/libcfs/curproc.h
@@ -63,7 +63,6 @@ intcfs_curproc_groups_nr(void);
 /* check if task is running in compat mode.*/
 #define current_pid()  (current-pid)
 #define current_comm() (current-comm)
-int cfs_get_environ(const char *key, char *value, int *val_len);
 
 typedef __u32 cfs_cap_t;
 
diff --git a/drivers/staging/lustre/include/linux/libcfs/lucache.h 
b/drivers/staging/lustre/include/linux/libcfs/lucache.h
index 9668b39..f8361d7 100644
--- a/drivers/staging/lustre/include/linux/libcfs/lucache.h
+++ b/drivers/staging/lustre/include/linux/libcfs/lucache.h
@@ -82,6 +82,11 @@ struct md_identity {
struct md_perm  *mi_perms;
 };
 
+struct jobid_cache_entry {
+   struct upcall_cache_entry *jce_uc_entry;
+   char  *jce_jobid;
+};
+
 struct upcall_cache_entry {
struct list_head  ue_hash;
__u64  ue_key;
@@ -92,6 +97,7 @@ struct upcall_cache_entry {
cfs_time_tue_expire;
union {
struct md_identity identity;
+   struct jobid_cache_entry jobid;
} u;
 };
 
diff --git a/drivers/staging/lustre/lustre/include/lprocfs_status.h 
b/drivers/staging/lustre/lustre/include/lprocfs_status.h
index 428e3e4..3c99dcf 100644
--- a/drivers/staging/lustre/lustre/include/lprocfs_status.h
+++ b/drivers/staging/lustre/lustre/include/lprocfs_status.h
@@ -369,6 +369,7 @@ static inline void s2dhms(struct dhms *ts, time_t secs)
 #define JOBSTATS_JOBID_VAR_MAX_LEN 20
 #define JOBSTATS_DISABLE   disable
 #define JOBSTATS_PROCNAME_UID  procname_uid
+#define JOBSTATS_NODELOCAL nodelocal
 
 extern int lprocfs_write_frac_helper(const char *buffer, unsigned long count,
 int *val, int mult);
diff --git a/drivers/staging/lustre/lustre/libcfs/linux/linux-curproc.c 
b/drivers/staging/lustre/lustre/libcfs/linux/linux-curproc.c
index a2ef64c..7c48601 100644
--- a/drivers/staging/lustre/lustre/libcfs/linux/linux-curproc.c
+++ b/drivers/staging/lustre/lustre/libcfs/linux/linux-curproc.c
@@ -140,158 +140,6 @@ int cfs_capable(cfs_cap_t cap)
return capable(cfs_cap_unpack(cap));
 }
 
-static int cfs_access_process_vm(struct task_struct *tsk, unsigned long addr,
-void *buf, int len, int write)
-{
-   /* Just copied from kernel for the kernels which doesn't
-* have access_process_vm() exported */
-   struct mm_struct *mm;
-   struct vm_area_struct *vma;
-   struct page *page;
- 

Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc

2013-10-30 Thread Peng Tao
On Wed, Oct 30, 2013 at 10:20 PM, Greg Kroah-Hartman
 wrote:
> On Wed, Oct 30, 2013 at 09:31:24PM +0800, Peng Tao wrote:
>> On Wed, Oct 30, 2013 at 9:21 PM, Greg Kroah-Hartman
>>  wrote:
>> > On Wed, Oct 30, 2013 at 07:30:34PM +0800, Peng Tao wrote:
>> >> so that we can get rid of cfs_get_environ() that needs 
>> >> access_process_vm() that
>> >> is a core mm function and is not available on some architectures.
>> >>
>> >> Reviewed-by: Niu Yawei 
>> >> Signed-off-by: Peng Tao 
>> >> Signed-off-by: Andreas Dilger 
>> >> ---
>> >>  drivers/staging/lustre/lustre/obdclass/class_obd.c |  119 
>> >> +++-
>> >>  1 file changed, 115 insertions(+), 4 deletions(-)
>> >>
>> >> diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c 
>> >> b/drivers/staging/lustre/lustre/obdclass/class_obd.c
>> >> index b1024a6..99ce863 100644
>> >> --- a/drivers/staging/lustre/lustre/obdclass/class_obd.c
>> >> +++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c
>> >> @@ -96,8 +96,117 @@ EXPORT_SYMBOL(obd_dirty_transit_pages);
>> >>  char obd_jobid_var[JOBSTATS_JOBID_VAR_MAX_LEN + 1] = JOBSTATS_DISABLE;
>> >>  EXPORT_SYMBOL(obd_jobid_var);
>> >>
>> >> -/* Get jobid of current process by reading the environment variable
>> >> - * stored in between the "env_start" & "env_end" of task struct.
>> >> +static char *self_environ_file = "/proc/self/environ";
>> >
>> > Heh, no, that's not ok at all.
>> >
>> > This is a _huge_ sign that you are doing something wrong in your driver
>> > if you need something that isn't exported, or that you have to dig out
>> > of proc.
>> >
>> Lustre has a functionality that lets applications specify a global
>> jobid by setting the same environment variable across the cluster and
>> thus server can trace IO that is issued by different job. Currently it
>> is implemented by having a private version of cfs_access_process_vm
>> because access_process_vm is not exported. The patch intends to change
>> it to reading from proc. If we cannot dig it out from proc either, how
>> can we keep the functionality then? Please advise.
>
> Kernel code should not care about environment variables.  Just because
> you created a crazy requirement, does not mean it is a valid one, nor
> that I have to be the one to tell you how to implement it.
>
> I'm never going to take code that parses proc files, you all know better
> than that.
I see. I'll drop the patchset and keep the cfs_access_process_vm code
until better option is found.

Thanks,
Tao
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc

2013-10-30 Thread Greg Kroah-Hartman
On Wed, Oct 30, 2013 at 09:31:24PM +0800, Peng Tao wrote:
> On Wed, Oct 30, 2013 at 9:21 PM, Greg Kroah-Hartman
>  wrote:
> > On Wed, Oct 30, 2013 at 07:30:34PM +0800, Peng Tao wrote:
> >> so that we can get rid of cfs_get_environ() that needs access_process_vm() 
> >> that
> >> is a core mm function and is not available on some architectures.
> >>
> >> Reviewed-by: Niu Yawei 
> >> Signed-off-by: Peng Tao 
> >> Signed-off-by: Andreas Dilger 
> >> ---
> >>  drivers/staging/lustre/lustre/obdclass/class_obd.c |  119 
> >> +++-
> >>  1 file changed, 115 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c 
> >> b/drivers/staging/lustre/lustre/obdclass/class_obd.c
> >> index b1024a6..99ce863 100644
> >> --- a/drivers/staging/lustre/lustre/obdclass/class_obd.c
> >> +++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c
> >> @@ -96,8 +96,117 @@ EXPORT_SYMBOL(obd_dirty_transit_pages);
> >>  char obd_jobid_var[JOBSTATS_JOBID_VAR_MAX_LEN + 1] = JOBSTATS_DISABLE;
> >>  EXPORT_SYMBOL(obd_jobid_var);
> >>
> >> -/* Get jobid of current process by reading the environment variable
> >> - * stored in between the "env_start" & "env_end" of task struct.
> >> +static char *self_environ_file = "/proc/self/environ";
> >
> > Heh, no, that's not ok at all.
> >
> > This is a _huge_ sign that you are doing something wrong in your driver
> > if you need something that isn't exported, or that you have to dig out
> > of proc.
> >
> Lustre has a functionality that lets applications specify a global
> jobid by setting the same environment variable across the cluster and
> thus server can trace IO that is issued by different job. Currently it
> is implemented by having a private version of cfs_access_process_vm
> because access_process_vm is not exported. The patch intends to change
> it to reading from proc. If we cannot dig it out from proc either, how
> can we keep the functionality then? Please advise.

Kernel code should not care about environment variables.  Just because
you created a crazy requirement, does not mean it is a valid one, nor
that I have to be the one to tell you how to implement it.

I'm never going to take code that parses proc files, you all know better
than that.

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc

2013-10-30 Thread Peng Tao
On Wed, Oct 30, 2013 at 9:21 PM, Greg Kroah-Hartman
 wrote:
> On Wed, Oct 30, 2013 at 07:30:34PM +0800, Peng Tao wrote:
>> so that we can get rid of cfs_get_environ() that needs access_process_vm() 
>> that
>> is a core mm function and is not available on some architectures.
>>
>> Reviewed-by: Niu Yawei 
>> Signed-off-by: Peng Tao 
>> Signed-off-by: Andreas Dilger 
>> ---
>>  drivers/staging/lustre/lustre/obdclass/class_obd.c |  119 
>> +++-
>>  1 file changed, 115 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c 
>> b/drivers/staging/lustre/lustre/obdclass/class_obd.c
>> index b1024a6..99ce863 100644
>> --- a/drivers/staging/lustre/lustre/obdclass/class_obd.c
>> +++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c
>> @@ -96,8 +96,117 @@ EXPORT_SYMBOL(obd_dirty_transit_pages);
>>  char obd_jobid_var[JOBSTATS_JOBID_VAR_MAX_LEN + 1] = JOBSTATS_DISABLE;
>>  EXPORT_SYMBOL(obd_jobid_var);
>>
>> -/* Get jobid of current process by reading the environment variable
>> - * stored in between the "env_start" & "env_end" of task struct.
>> +static char *self_environ_file = "/proc/self/environ";
>
> Heh, no, that's not ok at all.
>
> This is a _huge_ sign that you are doing something wrong in your driver
> if you need something that isn't exported, or that you have to dig out
> of proc.
>
Lustre has a functionality that lets applications specify a global
jobid by setting the same environment variable across the cluster and
thus server can trace IO that is issued by different job. Currently it
is implemented by having a private version of cfs_access_process_vm
because access_process_vm is not exported. The patch intends to change
it to reading from proc. If we cannot dig it out from proc either, how
can we keep the functionality then? Please advise.

Thanks,
Tao

> Sorry, I can't take this, please fix the underlying problems that would
> even think that you need access to the environment from within a kernel
> driver.
>
> greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc

2013-10-30 Thread Greg Kroah-Hartman
On Wed, Oct 30, 2013 at 07:30:34PM +0800, Peng Tao wrote:
> so that we can get rid of cfs_get_environ() that needs access_process_vm() 
> that
> is a core mm function and is not available on some architectures.
> 
> Reviewed-by: Niu Yawei 
> Signed-off-by: Peng Tao 
> Signed-off-by: Andreas Dilger 
> ---
>  drivers/staging/lustre/lustre/obdclass/class_obd.c |  119 
> +++-
>  1 file changed, 115 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c 
> b/drivers/staging/lustre/lustre/obdclass/class_obd.c
> index b1024a6..99ce863 100644
> --- a/drivers/staging/lustre/lustre/obdclass/class_obd.c
> +++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c
> @@ -96,8 +96,117 @@ EXPORT_SYMBOL(obd_dirty_transit_pages);
>  char obd_jobid_var[JOBSTATS_JOBID_VAR_MAX_LEN + 1] = JOBSTATS_DISABLE;
>  EXPORT_SYMBOL(obd_jobid_var);
>  
> -/* Get jobid of current process by reading the environment variable
> - * stored in between the "env_start" & "env_end" of task struct.
> +static char *self_environ_file = "/proc/self/environ";

Heh, no, that's not ok at all.

This is a _huge_ sign that you are doing something wrong in your driver
if you need something that isn't exported, or that you have to dig out
of proc.

Sorry, I can't take this, please fix the underlying problems that would
even think that you need access to the environment from within a kernel
driver.

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/4] staging/lustre/obdclass: read jobid from proc

2013-10-30 Thread Peng Tao
so that we can get rid of cfs_get_environ() that needs access_process_vm() that
is a core mm function and is not available on some architectures.

Reviewed-by: Niu Yawei 
Signed-off-by: Peng Tao 
Signed-off-by: Andreas Dilger 
---
 drivers/staging/lustre/lustre/obdclass/class_obd.c |  119 +++-
 1 file changed, 115 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c 
b/drivers/staging/lustre/lustre/obdclass/class_obd.c
index b1024a6..99ce863 100644
--- a/drivers/staging/lustre/lustre/obdclass/class_obd.c
+++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c
@@ -96,8 +96,117 @@ EXPORT_SYMBOL(obd_dirty_transit_pages);
 char obd_jobid_var[JOBSTATS_JOBID_VAR_MAX_LEN + 1] = JOBSTATS_DISABLE;
 EXPORT_SYMBOL(obd_jobid_var);
 
-/* Get jobid of current process by reading the environment variable
- * stored in between the "env_start" & "env_end" of task struct.
+static char *self_environ_file = "/proc/self/environ";
+static int obd_get_environ(const char *key, char *value, int *val_len)
+{
+   struct mm_struct *mm;
+   struct path path;
+   struct file *filp = NULL;
+   int buf_len = PAGE_CACHE_SIZE;
+   int key_len = strlen(key);
+   char *buffer = NULL;
+   loff_t pos = 0;
+   int rc;
+
+   /*
+* test mm->mmap_sem to avoid deadlock if this ever gets called from
+* mmap code.
+*/
+   mm = get_task_mm(current);
+   if (!mm)
+   return -EINVAL;
+   if (down_read_trylock(>mmap_sem) == 0) {
+   mmput(mm);
+   return -EDEADLK;
+   }
+   up_read(>mmap_sem);
+   mmput(mm);
+
+   buffer = kmalloc(buf_len, GFP_NOIO);
+   if (!buffer)
+   return -ENOMEM;
+
+   rc = kern_path(self_environ_file, LOOKUP_FOLLOW, );
+   if (rc)
+   goto out;
+
+   filp = dentry_open(, O_RDONLY, current_cred());
+   if (IS_ERR(filp)) {
+   rc = PTR_ERR(filp);
+   filp = NULL;
+   goto out;
+   }
+
+   /* loop reading... */
+   while (1) {
+   int scan_len, this_len;
+   char *env_start, *env_end;
+   rc = kernel_read(filp, pos, buffer, buf_len);
+   if (rc <= 0)
+   break;
+
+   pos += rc;
+   /* Parse the buffer to find out the specified key/value pair.
+* The "key=value" entries are separated by '\0'. */
+   env_start = buffer;
+   scan_len = this_len = rc;
+   while (scan_len) {
+   char *entry;
+   int entry_len;
+
+   env_end = memscan(env_start, '\0', scan_len);
+   LASSERT(env_end >= env_start &&
+   env_end <= env_start + scan_len);
+
+   /* The last entry of this buffer cross the buffer
+* boundary, reread it in next cycle. */
+   if (unlikely(env_end - env_start == scan_len)) {
+   /* This entry is too large to fit in buffer */
+   if (unlikely(scan_len == this_len)) {
+   rc = -EINVAL;
+   CWARN("Environment variable '%s' too 
long: rc = %d\n",
+ key, rc);
+   goto out;
+   }
+   pos -= scan_len;
+   break;
+   }
+
+   entry = env_start;
+   entry_len = env_end - env_start;
+
+   /* Key length + length of '=' */
+   if (entry_len > key_len + 1 &&
+   !memcmp(entry, key, key_len)) {
+   entry += key_len + 1;
+   entry_len -= key_len + 1;
+   /* The 'value' buffer passed in is too small.*/
+   if (entry_len >= *val_len)
+   GOTO(out, rc = -EOVERFLOW);
+
+   memcpy(value, entry, entry_len);
+   *val_len = entry_len;
+   rc = 0;
+   goto out;
+   }
+
+   scan_len -= (env_end - env_start + 1);
+   env_start = env_end + 1;
+   }
+   }
+   if (rc >= 0)
+   rc = -ENOENT;
+out:
+   if (filp)
+   fput(filp);
+   if (buffer)
+   kfree(buffer);
+   return rc;
+}
+
+/*
+ * Get jobid of current process by reading the environment variable
+ * from /proc/self/environ.
  *
  * TODO:
  * It's better to cache the jobid for later use if there is any
@@ -126,14 +235,14 @@ int 

[PATCH 2/4] staging/lustre/obdclass: read jobid from proc

2013-10-30 Thread Peng Tao
so that we can get rid of cfs_get_environ() that needs access_process_vm() that
is a core mm function and is not available on some architectures.

Reviewed-by: Niu Yawei yawei@intel.com
Signed-off-by: Peng Tao bergw...@gmail.com
Signed-off-by: Andreas Dilger andreas.dil...@intel.com
---
 drivers/staging/lustre/lustre/obdclass/class_obd.c |  119 +++-
 1 file changed, 115 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c 
b/drivers/staging/lustre/lustre/obdclass/class_obd.c
index b1024a6..99ce863 100644
--- a/drivers/staging/lustre/lustre/obdclass/class_obd.c
+++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c
@@ -96,8 +96,117 @@ EXPORT_SYMBOL(obd_dirty_transit_pages);
 char obd_jobid_var[JOBSTATS_JOBID_VAR_MAX_LEN + 1] = JOBSTATS_DISABLE;
 EXPORT_SYMBOL(obd_jobid_var);
 
-/* Get jobid of current process by reading the environment variable
- * stored in between the env_start  env_end of task struct.
+static char *self_environ_file = /proc/self/environ;
+static int obd_get_environ(const char *key, char *value, int *val_len)
+{
+   struct mm_struct *mm;
+   struct path path;
+   struct file *filp = NULL;
+   int buf_len = PAGE_CACHE_SIZE;
+   int key_len = strlen(key);
+   char *buffer = NULL;
+   loff_t pos = 0;
+   int rc;
+
+   /*
+* test mm-mmap_sem to avoid deadlock if this ever gets called from
+* mmap code.
+*/
+   mm = get_task_mm(current);
+   if (!mm)
+   return -EINVAL;
+   if (down_read_trylock(mm-mmap_sem) == 0) {
+   mmput(mm);
+   return -EDEADLK;
+   }
+   up_read(mm-mmap_sem);
+   mmput(mm);
+
+   buffer = kmalloc(buf_len, GFP_NOIO);
+   if (!buffer)
+   return -ENOMEM;
+
+   rc = kern_path(self_environ_file, LOOKUP_FOLLOW, path);
+   if (rc)
+   goto out;
+
+   filp = dentry_open(path, O_RDONLY, current_cred());
+   if (IS_ERR(filp)) {
+   rc = PTR_ERR(filp);
+   filp = NULL;
+   goto out;
+   }
+
+   /* loop reading... */
+   while (1) {
+   int scan_len, this_len;
+   char *env_start, *env_end;
+   rc = kernel_read(filp, pos, buffer, buf_len);
+   if (rc = 0)
+   break;
+
+   pos += rc;
+   /* Parse the buffer to find out the specified key/value pair.
+* The key=value entries are separated by '\0'. */
+   env_start = buffer;
+   scan_len = this_len = rc;
+   while (scan_len) {
+   char *entry;
+   int entry_len;
+
+   env_end = memscan(env_start, '\0', scan_len);
+   LASSERT(env_end = env_start 
+   env_end = env_start + scan_len);
+
+   /* The last entry of this buffer cross the buffer
+* boundary, reread it in next cycle. */
+   if (unlikely(env_end - env_start == scan_len)) {
+   /* This entry is too large to fit in buffer */
+   if (unlikely(scan_len == this_len)) {
+   rc = -EINVAL;
+   CWARN(Environment variable '%s' too 
long: rc = %d\n,
+ key, rc);
+   goto out;
+   }
+   pos -= scan_len;
+   break;
+   }
+
+   entry = env_start;
+   entry_len = env_end - env_start;
+
+   /* Key length + length of '=' */
+   if (entry_len  key_len + 1 
+   !memcmp(entry, key, key_len)) {
+   entry += key_len + 1;
+   entry_len -= key_len + 1;
+   /* The 'value' buffer passed in is too small.*/
+   if (entry_len = *val_len)
+   GOTO(out, rc = -EOVERFLOW);
+
+   memcpy(value, entry, entry_len);
+   *val_len = entry_len;
+   rc = 0;
+   goto out;
+   }
+
+   scan_len -= (env_end - env_start + 1);
+   env_start = env_end + 1;
+   }
+   }
+   if (rc = 0)
+   rc = -ENOENT;
+out:
+   if (filp)
+   fput(filp);
+   if (buffer)
+   kfree(buffer);
+   return rc;
+}
+
+/*
+ * Get jobid of current process by reading the environment variable
+ * from /proc/self/environ.
  *
  * TODO:
  * It's better to cache the jobid for later use 

Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc

2013-10-30 Thread Greg Kroah-Hartman
On Wed, Oct 30, 2013 at 07:30:34PM +0800, Peng Tao wrote:
 so that we can get rid of cfs_get_environ() that needs access_process_vm() 
 that
 is a core mm function and is not available on some architectures.
 
 Reviewed-by: Niu Yawei yawei@intel.com
 Signed-off-by: Peng Tao bergw...@gmail.com
 Signed-off-by: Andreas Dilger andreas.dil...@intel.com
 ---
  drivers/staging/lustre/lustre/obdclass/class_obd.c |  119 
 +++-
  1 file changed, 115 insertions(+), 4 deletions(-)
 
 diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c 
 b/drivers/staging/lustre/lustre/obdclass/class_obd.c
 index b1024a6..99ce863 100644
 --- a/drivers/staging/lustre/lustre/obdclass/class_obd.c
 +++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c
 @@ -96,8 +96,117 @@ EXPORT_SYMBOL(obd_dirty_transit_pages);
  char obd_jobid_var[JOBSTATS_JOBID_VAR_MAX_LEN + 1] = JOBSTATS_DISABLE;
  EXPORT_SYMBOL(obd_jobid_var);
  
 -/* Get jobid of current process by reading the environment variable
 - * stored in between the env_start  env_end of task struct.
 +static char *self_environ_file = /proc/self/environ;

Heh, no, that's not ok at all.

This is a _huge_ sign that you are doing something wrong in your driver
if you need something that isn't exported, or that you have to dig out
of proc.

Sorry, I can't take this, please fix the underlying problems that would
even think that you need access to the environment from within a kernel
driver.

greg k-h
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc

2013-10-30 Thread Peng Tao
On Wed, Oct 30, 2013 at 9:21 PM, Greg Kroah-Hartman
gre...@linuxfoundation.org wrote:
 On Wed, Oct 30, 2013 at 07:30:34PM +0800, Peng Tao wrote:
 so that we can get rid of cfs_get_environ() that needs access_process_vm() 
 that
 is a core mm function and is not available on some architectures.

 Reviewed-by: Niu Yawei yawei@intel.com
 Signed-off-by: Peng Tao bergw...@gmail.com
 Signed-off-by: Andreas Dilger andreas.dil...@intel.com
 ---
  drivers/staging/lustre/lustre/obdclass/class_obd.c |  119 
 +++-
  1 file changed, 115 insertions(+), 4 deletions(-)

 diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c 
 b/drivers/staging/lustre/lustre/obdclass/class_obd.c
 index b1024a6..99ce863 100644
 --- a/drivers/staging/lustre/lustre/obdclass/class_obd.c
 +++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c
 @@ -96,8 +96,117 @@ EXPORT_SYMBOL(obd_dirty_transit_pages);
  char obd_jobid_var[JOBSTATS_JOBID_VAR_MAX_LEN + 1] = JOBSTATS_DISABLE;
  EXPORT_SYMBOL(obd_jobid_var);

 -/* Get jobid of current process by reading the environment variable
 - * stored in between the env_start  env_end of task struct.
 +static char *self_environ_file = /proc/self/environ;

 Heh, no, that's not ok at all.

 This is a _huge_ sign that you are doing something wrong in your driver
 if you need something that isn't exported, or that you have to dig out
 of proc.

Lustre has a functionality that lets applications specify a global
jobid by setting the same environment variable across the cluster and
thus server can trace IO that is issued by different job. Currently it
is implemented by having a private version of cfs_access_process_vm
because access_process_vm is not exported. The patch intends to change
it to reading from proc. If we cannot dig it out from proc either, how
can we keep the functionality then? Please advise.

Thanks,
Tao

 Sorry, I can't take this, please fix the underlying problems that would
 even think that you need access to the environment from within a kernel
 driver.

 greg k-h
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc

2013-10-30 Thread Greg Kroah-Hartman
On Wed, Oct 30, 2013 at 09:31:24PM +0800, Peng Tao wrote:
 On Wed, Oct 30, 2013 at 9:21 PM, Greg Kroah-Hartman
 gre...@linuxfoundation.org wrote:
  On Wed, Oct 30, 2013 at 07:30:34PM +0800, Peng Tao wrote:
  so that we can get rid of cfs_get_environ() that needs access_process_vm() 
  that
  is a core mm function and is not available on some architectures.
 
  Reviewed-by: Niu Yawei yawei@intel.com
  Signed-off-by: Peng Tao bergw...@gmail.com
  Signed-off-by: Andreas Dilger andreas.dil...@intel.com
  ---
   drivers/staging/lustre/lustre/obdclass/class_obd.c |  119 
  +++-
   1 file changed, 115 insertions(+), 4 deletions(-)
 
  diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c 
  b/drivers/staging/lustre/lustre/obdclass/class_obd.c
  index b1024a6..99ce863 100644
  --- a/drivers/staging/lustre/lustre/obdclass/class_obd.c
  +++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c
  @@ -96,8 +96,117 @@ EXPORT_SYMBOL(obd_dirty_transit_pages);
   char obd_jobid_var[JOBSTATS_JOBID_VAR_MAX_LEN + 1] = JOBSTATS_DISABLE;
   EXPORT_SYMBOL(obd_jobid_var);
 
  -/* Get jobid of current process by reading the environment variable
  - * stored in between the env_start  env_end of task struct.
  +static char *self_environ_file = /proc/self/environ;
 
  Heh, no, that's not ok at all.
 
  This is a _huge_ sign that you are doing something wrong in your driver
  if you need something that isn't exported, or that you have to dig out
  of proc.
 
 Lustre has a functionality that lets applications specify a global
 jobid by setting the same environment variable across the cluster and
 thus server can trace IO that is issued by different job. Currently it
 is implemented by having a private version of cfs_access_process_vm
 because access_process_vm is not exported. The patch intends to change
 it to reading from proc. If we cannot dig it out from proc either, how
 can we keep the functionality then? Please advise.

Kernel code should not care about environment variables.  Just because
you created a crazy requirement, does not mean it is a valid one, nor
that I have to be the one to tell you how to implement it.

I'm never going to take code that parses proc files, you all know better
than that.

greg k-h
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc

2013-10-30 Thread Peng Tao
On Wed, Oct 30, 2013 at 10:20 PM, Greg Kroah-Hartman
gre...@linuxfoundation.org wrote:
 On Wed, Oct 30, 2013 at 09:31:24PM +0800, Peng Tao wrote:
 On Wed, Oct 30, 2013 at 9:21 PM, Greg Kroah-Hartman
 gre...@linuxfoundation.org wrote:
  On Wed, Oct 30, 2013 at 07:30:34PM +0800, Peng Tao wrote:
  so that we can get rid of cfs_get_environ() that needs 
  access_process_vm() that
  is a core mm function and is not available on some architectures.
 
  Reviewed-by: Niu Yawei yawei@intel.com
  Signed-off-by: Peng Tao bergw...@gmail.com
  Signed-off-by: Andreas Dilger andreas.dil...@intel.com
  ---
   drivers/staging/lustre/lustre/obdclass/class_obd.c |  119 
  +++-
   1 file changed, 115 insertions(+), 4 deletions(-)
 
  diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c 
  b/drivers/staging/lustre/lustre/obdclass/class_obd.c
  index b1024a6..99ce863 100644
  --- a/drivers/staging/lustre/lustre/obdclass/class_obd.c
  +++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c
  @@ -96,8 +96,117 @@ EXPORT_SYMBOL(obd_dirty_transit_pages);
   char obd_jobid_var[JOBSTATS_JOBID_VAR_MAX_LEN + 1] = JOBSTATS_DISABLE;
   EXPORT_SYMBOL(obd_jobid_var);
 
  -/* Get jobid of current process by reading the environment variable
  - * stored in between the env_start  env_end of task struct.
  +static char *self_environ_file = /proc/self/environ;
 
  Heh, no, that's not ok at all.
 
  This is a _huge_ sign that you are doing something wrong in your driver
  if you need something that isn't exported, or that you have to dig out
  of proc.
 
 Lustre has a functionality that lets applications specify a global
 jobid by setting the same environment variable across the cluster and
 thus server can trace IO that is issued by different job. Currently it
 is implemented by having a private version of cfs_access_process_vm
 because access_process_vm is not exported. The patch intends to change
 it to reading from proc. If we cannot dig it out from proc either, how
 can we keep the functionality then? Please advise.

 Kernel code should not care about environment variables.  Just because
 you created a crazy requirement, does not mean it is a valid one, nor
 that I have to be the one to tell you how to implement it.

 I'm never going to take code that parses proc files, you all know better
 than that.
I see. I'll drop the patchset and keep the cfs_access_process_vm code
until better option is found.

Thanks,
Tao
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/