Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc
Hello! On Feb 4, 2014, at 11:57 AM, Greg Kroah-Hartman wrote: > What exactly are you doing here? Calling out to userspace for what > information? And how are you going to handle namespaces and containers > by doing that? Are you going to block in the kernel for this > information? > > What are you trying to solve with this code in the first place? So, as an overview of a feature: When you have tens of thousand (and even hundreds) of nodes doing IO, it's no longer practical to tell them apart separately or in some network-related groups based on their address server-side (for purposes like monitoring and load control). Since such systems are usually managed by some sort of a job/batch scheduler, it's much more natural to organize them into "jobs" as known to the job scheduler instead. This job id information is harvested and added to all RPCs so that server side can do useful things with this information (like aggregate statistics, identify pathologically bad workcases, QoS and so on). Most of batch schedulers out there let userspace know their own JOBID as an environment variable. So original implementation was just harvesting this info directly from process environment. I certainly can see why this is not really desirable. So, the patch does away with the environment parsing, instead it adds two new venues of getting this information: 1. In vast majorities of cases entire node is dedicated to a single job, so we just create a /proc variable where you can input job id from job scheduler prologue (and then clear it from an epilogue). Getting jobid in this case does does not dip in userspace anymore. This also does not block anywhere. 2. In some more rare systems with lots of cores they actually seem to be subdividing individual nodes across jobs. Additionally all systems usually have login/interactive nodes. While these sort of interactive nodes do not have jobs scheduled on them, it still might be useful to distinguish between different user sessions happening there. This is where the upcalls come into play. First time a process does IO an upcall would be called that would provide the kernel with jobid identifier (however it might want to obtain it, we don't really care at this point). This would block (with a timeout) for the reply. The reply is then cached and reused for subsequent io from the same process. I did not really think about containers before, but I think it would work out anyway: I think namespaces and containers still have non-intersecting pid space so we should be fine in this regard. There is a "master container"/namespace of some sort I think, that is able to see entire pid space, and that's where the upcall would be run (do I need to somehow force that?) Bye, Oleg -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc
On Tue, Feb 04, 2014 at 10:12:10AM +0400, Oleg Drokin wrote: > Hello! > > On Wed, Oct 30, 2013 at 06:21:01AM -0700, Greg Kroah-Hartman wrote: > > > - * stored in between the "env_start" & "env_end" of task struct. > > > +static char *self_environ_file = "/proc/self/environ"; > > > > Heh, no, that's not ok at all. > > > > This is a _huge_ sign that you are doing something wrong in your driver > > if you need something that isn't exported, or that you have to dig out > > of proc. > > > > Sorry, I can't take this, please fix the underlying problems that would > > even think that you need access to the environment from within a kernel > > driver. > > I took a stab at this. > This is not a final patch, I know there's still some number of checkpatch > warnings and the proc layout is not finalized yet for example. > > But before I spend any more time in polishing this, can you please take a look > and advise if this direction would be acceptable for you when driven to > completion? What exactly are you doing here? Calling out to userspace for what information? And how are you going to handle namespaces and containers by doing that? Are you going to block in the kernel for this information? What are you trying to solve with this code in the first place? greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc
On Tue, Feb 04, 2014 at 10:12:10AM +0400, Oleg Drokin wrote: Hello! On Wed, Oct 30, 2013 at 06:21:01AM -0700, Greg Kroah-Hartman wrote: - * stored in between the env_start env_end of task struct. +static char *self_environ_file = /proc/self/environ; Heh, no, that's not ok at all. This is a _huge_ sign that you are doing something wrong in your driver if you need something that isn't exported, or that you have to dig out of proc. Sorry, I can't take this, please fix the underlying problems that would even think that you need access to the environment from within a kernel driver. I took a stab at this. This is not a final patch, I know there's still some number of checkpatch warnings and the proc layout is not finalized yet for example. But before I spend any more time in polishing this, can you please take a look and advise if this direction would be acceptable for you when driven to completion? What exactly are you doing here? Calling out to userspace for what information? And how are you going to handle namespaces and containers by doing that? Are you going to block in the kernel for this information? What are you trying to solve with this code in the first place? greg k-h -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc
Hello! On Feb 4, 2014, at 11:57 AM, Greg Kroah-Hartman wrote: What exactly are you doing here? Calling out to userspace for what information? And how are you going to handle namespaces and containers by doing that? Are you going to block in the kernel for this information? What are you trying to solve with this code in the first place? So, as an overview of a feature: When you have tens of thousand (and even hundreds) of nodes doing IO, it's no longer practical to tell them apart separately or in some network-related groups based on their address server-side (for purposes like monitoring and load control). Since such systems are usually managed by some sort of a job/batch scheduler, it's much more natural to organize them into jobs as known to the job scheduler instead. This job id information is harvested and added to all RPCs so that server side can do useful things with this information (like aggregate statistics, identify pathologically bad workcases, QoS and so on). Most of batch schedulers out there let userspace know their own JOBID as an environment variable. So original implementation was just harvesting this info directly from process environment. I certainly can see why this is not really desirable. So, the patch does away with the environment parsing, instead it adds two new venues of getting this information: 1. In vast majorities of cases entire node is dedicated to a single job, so we just create a /proc variable where you can input job id from job scheduler prologue (and then clear it from an epilogue). Getting jobid in this case does does not dip in userspace anymore. This also does not block anywhere. 2. In some more rare systems with lots of cores they actually seem to be subdividing individual nodes across jobs. Additionally all systems usually have login/interactive nodes. While these sort of interactive nodes do not have jobs scheduled on them, it still might be useful to distinguish between different user sessions happening there. This is where the upcalls come into play. First time a process does IO an upcall would be called that would provide the kernel with jobid identifier (however it might want to obtain it, we don't really care at this point). This would block (with a timeout) for the reply. The reply is then cached and reused for subsequent io from the same process. I did not really think about containers before, but I think it would work out anyway: I think namespaces and containers still have non-intersecting pid space so we should be fine in this regard. There is a master container/namespace of some sort I think, that is able to see entire pid space, and that's where the upcall would be run (do I need to somehow force that?) Bye, Oleg -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc
Hello! On Wed, Oct 30, 2013 at 06:21:01AM -0700, Greg Kroah-Hartman wrote: > > - * stored in between the "env_start" & "env_end" of task struct. > > +static char *self_environ_file = "/proc/self/environ"; > > Heh, no, that's not ok at all. > > This is a _huge_ sign that you are doing something wrong in your driver > if you need something that isn't exported, or that you have to dig out > of proc. > > Sorry, I can't take this, please fix the underlying problems that would > even think that you need access to the environment from within a kernel > driver. I took a stab at this. This is not a final patch, I know there's still some number of checkpatch warnings and the proc layout is not finalized yet for example. But before I spend any more time in polishing this, can you please take a look and advise if this direction would be acceptable for you when driven to completion? Thanks. >From 6a5b58657cc32163738d4a8c210e8683159b582f Mon Sep 17 00:00:00 2001 From: Oleg Drokin Date: Tue, 4 Feb 2014 00:32:12 -0500 Subject: [PATCH] staging/lustre: Obtain jobid invormation via upcall Replace lustre jobid information fetching directly from process env variable with either node-wide jobid obtained via a proc file, or through an upcall that would provide the jobid if more fine-grained operations are necessary. Signed-off-by: Oleg Drokin --- .../staging/lustre/include/linux/libcfs/curproc.h | 1 - .../staging/lustre/include/linux/libcfs/lucache.h | 6 + .../staging/lustre/lustre/include/lprocfs_status.h | 1 + .../lustre/lustre/libcfs/linux/linux-curproc.c | 152 - drivers/staging/lustre/lustre/obdclass/Makefile| 2 +- drivers/staging/lustre/lustre/obdclass/class_obd.c | 67 .../staging/lustre/lustre/obdclass/jobid_cache.c | 181 + .../lustre/lustre/obdclass/jobid_internal.h| 10 ++ .../lustre/lustre/obdclass/linux/linux-module.c| 114 + 9 files changed, 345 insertions(+), 189 deletions(-) create mode 100644 drivers/staging/lustre/lustre/obdclass/jobid_cache.c create mode 100644 drivers/staging/lustre/lustre/obdclass/jobid_internal.h diff --git a/drivers/staging/lustre/include/linux/libcfs/curproc.h b/drivers/staging/lustre/include/linux/libcfs/curproc.h index 507d16b..cf1f26b 100644 --- a/drivers/staging/lustre/include/linux/libcfs/curproc.h +++ b/drivers/staging/lustre/include/linux/libcfs/curproc.h @@ -63,7 +63,6 @@ intcfs_curproc_groups_nr(void); /* check if task is running in compat mode.*/ #define current_pid() (current->pid) #define current_comm() (current->comm) -int cfs_get_environ(const char *key, char *value, int *val_len); typedef __u32 cfs_cap_t; diff --git a/drivers/staging/lustre/include/linux/libcfs/lucache.h b/drivers/staging/lustre/include/linux/libcfs/lucache.h index 9668b39..f8361d7 100644 --- a/drivers/staging/lustre/include/linux/libcfs/lucache.h +++ b/drivers/staging/lustre/include/linux/libcfs/lucache.h @@ -82,6 +82,11 @@ struct md_identity { struct md_perm *mi_perms; }; +struct jobid_cache_entry { + struct upcall_cache_entry *jce_uc_entry; + char *jce_jobid; +}; + struct upcall_cache_entry { struct list_head ue_hash; __u64 ue_key; @@ -92,6 +97,7 @@ struct upcall_cache_entry { cfs_time_tue_expire; union { struct md_identity identity; + struct jobid_cache_entry jobid; } u; }; diff --git a/drivers/staging/lustre/lustre/include/lprocfs_status.h b/drivers/staging/lustre/lustre/include/lprocfs_status.h index 428e3e4..3c99dcf 100644 --- a/drivers/staging/lustre/lustre/include/lprocfs_status.h +++ b/drivers/staging/lustre/lustre/include/lprocfs_status.h @@ -369,6 +369,7 @@ static inline void s2dhms(struct dhms *ts, time_t secs) #define JOBSTATS_JOBID_VAR_MAX_LEN 20 #define JOBSTATS_DISABLE "disable" #define JOBSTATS_PROCNAME_UID "procname_uid" +#define JOBSTATS_NODELOCAL "nodelocal" extern int lprocfs_write_frac_helper(const char *buffer, unsigned long count, int *val, int mult); diff --git a/drivers/staging/lustre/lustre/libcfs/linux/linux-curproc.c b/drivers/staging/lustre/lustre/libcfs/linux/linux-curproc.c index a2ef64c..7c48601 100644 --- a/drivers/staging/lustre/lustre/libcfs/linux/linux-curproc.c +++ b/drivers/staging/lustre/lustre/libcfs/linux/linux-curproc.c @@ -140,158 +140,6 @@ int cfs_capable(cfs_cap_t cap) return capable(cfs_cap_unpack(cap)); } -static int cfs_access_process_vm(struct task_struct *tsk, unsigned long addr, -void *buf, int len, int write) -{ - /* Just copied from kernel for the kernels which doesn't -* have access_process_vm() exported */ - struct mm_struct *mm; - struct vm_area_struct *vma; - struct page *page; - void
Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc
Hello! On Wed, Oct 30, 2013 at 06:21:01AM -0700, Greg Kroah-Hartman wrote: - * stored in between the env_start env_end of task struct. +static char *self_environ_file = /proc/self/environ; Heh, no, that's not ok at all. This is a _huge_ sign that you are doing something wrong in your driver if you need something that isn't exported, or that you have to dig out of proc. Sorry, I can't take this, please fix the underlying problems that would even think that you need access to the environment from within a kernel driver. I took a stab at this. This is not a final patch, I know there's still some number of checkpatch warnings and the proc layout is not finalized yet for example. But before I spend any more time in polishing this, can you please take a look and advise if this direction would be acceptable for you when driven to completion? Thanks. From 6a5b58657cc32163738d4a8c210e8683159b582f Mon Sep 17 00:00:00 2001 From: Oleg Drokin gr...@linuxhacker.ru Date: Tue, 4 Feb 2014 00:32:12 -0500 Subject: [PATCH] staging/lustre: Obtain jobid invormation via upcall Replace lustre jobid information fetching directly from process env variable with either node-wide jobid obtained via a proc file, or through an upcall that would provide the jobid if more fine-grained operations are necessary. Signed-off-by: Oleg Drokin oleg.dro...@intel.com --- .../staging/lustre/include/linux/libcfs/curproc.h | 1 - .../staging/lustre/include/linux/libcfs/lucache.h | 6 + .../staging/lustre/lustre/include/lprocfs_status.h | 1 + .../lustre/lustre/libcfs/linux/linux-curproc.c | 152 - drivers/staging/lustre/lustre/obdclass/Makefile| 2 +- drivers/staging/lustre/lustre/obdclass/class_obd.c | 67 .../staging/lustre/lustre/obdclass/jobid_cache.c | 181 + .../lustre/lustre/obdclass/jobid_internal.h| 10 ++ .../lustre/lustre/obdclass/linux/linux-module.c| 114 + 9 files changed, 345 insertions(+), 189 deletions(-) create mode 100644 drivers/staging/lustre/lustre/obdclass/jobid_cache.c create mode 100644 drivers/staging/lustre/lustre/obdclass/jobid_internal.h diff --git a/drivers/staging/lustre/include/linux/libcfs/curproc.h b/drivers/staging/lustre/include/linux/libcfs/curproc.h index 507d16b..cf1f26b 100644 --- a/drivers/staging/lustre/include/linux/libcfs/curproc.h +++ b/drivers/staging/lustre/include/linux/libcfs/curproc.h @@ -63,7 +63,6 @@ intcfs_curproc_groups_nr(void); /* check if task is running in compat mode.*/ #define current_pid() (current-pid) #define current_comm() (current-comm) -int cfs_get_environ(const char *key, char *value, int *val_len); typedef __u32 cfs_cap_t; diff --git a/drivers/staging/lustre/include/linux/libcfs/lucache.h b/drivers/staging/lustre/include/linux/libcfs/lucache.h index 9668b39..f8361d7 100644 --- a/drivers/staging/lustre/include/linux/libcfs/lucache.h +++ b/drivers/staging/lustre/include/linux/libcfs/lucache.h @@ -82,6 +82,11 @@ struct md_identity { struct md_perm *mi_perms; }; +struct jobid_cache_entry { + struct upcall_cache_entry *jce_uc_entry; + char *jce_jobid; +}; + struct upcall_cache_entry { struct list_head ue_hash; __u64 ue_key; @@ -92,6 +97,7 @@ struct upcall_cache_entry { cfs_time_tue_expire; union { struct md_identity identity; + struct jobid_cache_entry jobid; } u; }; diff --git a/drivers/staging/lustre/lustre/include/lprocfs_status.h b/drivers/staging/lustre/lustre/include/lprocfs_status.h index 428e3e4..3c99dcf 100644 --- a/drivers/staging/lustre/lustre/include/lprocfs_status.h +++ b/drivers/staging/lustre/lustre/include/lprocfs_status.h @@ -369,6 +369,7 @@ static inline void s2dhms(struct dhms *ts, time_t secs) #define JOBSTATS_JOBID_VAR_MAX_LEN 20 #define JOBSTATS_DISABLE disable #define JOBSTATS_PROCNAME_UID procname_uid +#define JOBSTATS_NODELOCAL nodelocal extern int lprocfs_write_frac_helper(const char *buffer, unsigned long count, int *val, int mult); diff --git a/drivers/staging/lustre/lustre/libcfs/linux/linux-curproc.c b/drivers/staging/lustre/lustre/libcfs/linux/linux-curproc.c index a2ef64c..7c48601 100644 --- a/drivers/staging/lustre/lustre/libcfs/linux/linux-curproc.c +++ b/drivers/staging/lustre/lustre/libcfs/linux/linux-curproc.c @@ -140,158 +140,6 @@ int cfs_capable(cfs_cap_t cap) return capable(cfs_cap_unpack(cap)); } -static int cfs_access_process_vm(struct task_struct *tsk, unsigned long addr, -void *buf, int len, int write) -{ - /* Just copied from kernel for the kernels which doesn't -* have access_process_vm() exported */ - struct mm_struct *mm; - struct vm_area_struct *vma; - struct page *page; -
Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc
On Wed, Oct 30, 2013 at 10:20 PM, Greg Kroah-Hartman wrote: > On Wed, Oct 30, 2013 at 09:31:24PM +0800, Peng Tao wrote: >> On Wed, Oct 30, 2013 at 9:21 PM, Greg Kroah-Hartman >> wrote: >> > On Wed, Oct 30, 2013 at 07:30:34PM +0800, Peng Tao wrote: >> >> so that we can get rid of cfs_get_environ() that needs >> >> access_process_vm() that >> >> is a core mm function and is not available on some architectures. >> >> >> >> Reviewed-by: Niu Yawei >> >> Signed-off-by: Peng Tao >> >> Signed-off-by: Andreas Dilger >> >> --- >> >> drivers/staging/lustre/lustre/obdclass/class_obd.c | 119 >> >> +++- >> >> 1 file changed, 115 insertions(+), 4 deletions(-) >> >> >> >> diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c >> >> b/drivers/staging/lustre/lustre/obdclass/class_obd.c >> >> index b1024a6..99ce863 100644 >> >> --- a/drivers/staging/lustre/lustre/obdclass/class_obd.c >> >> +++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c >> >> @@ -96,8 +96,117 @@ EXPORT_SYMBOL(obd_dirty_transit_pages); >> >> char obd_jobid_var[JOBSTATS_JOBID_VAR_MAX_LEN + 1] = JOBSTATS_DISABLE; >> >> EXPORT_SYMBOL(obd_jobid_var); >> >> >> >> -/* Get jobid of current process by reading the environment variable >> >> - * stored in between the "env_start" & "env_end" of task struct. >> >> +static char *self_environ_file = "/proc/self/environ"; >> > >> > Heh, no, that's not ok at all. >> > >> > This is a _huge_ sign that you are doing something wrong in your driver >> > if you need something that isn't exported, or that you have to dig out >> > of proc. >> > >> Lustre has a functionality that lets applications specify a global >> jobid by setting the same environment variable across the cluster and >> thus server can trace IO that is issued by different job. Currently it >> is implemented by having a private version of cfs_access_process_vm >> because access_process_vm is not exported. The patch intends to change >> it to reading from proc. If we cannot dig it out from proc either, how >> can we keep the functionality then? Please advise. > > Kernel code should not care about environment variables. Just because > you created a crazy requirement, does not mean it is a valid one, nor > that I have to be the one to tell you how to implement it. > > I'm never going to take code that parses proc files, you all know better > than that. I see. I'll drop the patchset and keep the cfs_access_process_vm code until better option is found. Thanks, Tao -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc
On Wed, Oct 30, 2013 at 09:31:24PM +0800, Peng Tao wrote: > On Wed, Oct 30, 2013 at 9:21 PM, Greg Kroah-Hartman > wrote: > > On Wed, Oct 30, 2013 at 07:30:34PM +0800, Peng Tao wrote: > >> so that we can get rid of cfs_get_environ() that needs access_process_vm() > >> that > >> is a core mm function and is not available on some architectures. > >> > >> Reviewed-by: Niu Yawei > >> Signed-off-by: Peng Tao > >> Signed-off-by: Andreas Dilger > >> --- > >> drivers/staging/lustre/lustre/obdclass/class_obd.c | 119 > >> +++- > >> 1 file changed, 115 insertions(+), 4 deletions(-) > >> > >> diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c > >> b/drivers/staging/lustre/lustre/obdclass/class_obd.c > >> index b1024a6..99ce863 100644 > >> --- a/drivers/staging/lustre/lustre/obdclass/class_obd.c > >> +++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c > >> @@ -96,8 +96,117 @@ EXPORT_SYMBOL(obd_dirty_transit_pages); > >> char obd_jobid_var[JOBSTATS_JOBID_VAR_MAX_LEN + 1] = JOBSTATS_DISABLE; > >> EXPORT_SYMBOL(obd_jobid_var); > >> > >> -/* Get jobid of current process by reading the environment variable > >> - * stored in between the "env_start" & "env_end" of task struct. > >> +static char *self_environ_file = "/proc/self/environ"; > > > > Heh, no, that's not ok at all. > > > > This is a _huge_ sign that you are doing something wrong in your driver > > if you need something that isn't exported, or that you have to dig out > > of proc. > > > Lustre has a functionality that lets applications specify a global > jobid by setting the same environment variable across the cluster and > thus server can trace IO that is issued by different job. Currently it > is implemented by having a private version of cfs_access_process_vm > because access_process_vm is not exported. The patch intends to change > it to reading from proc. If we cannot dig it out from proc either, how > can we keep the functionality then? Please advise. Kernel code should not care about environment variables. Just because you created a crazy requirement, does not mean it is a valid one, nor that I have to be the one to tell you how to implement it. I'm never going to take code that parses proc files, you all know better than that. greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc
On Wed, Oct 30, 2013 at 9:21 PM, Greg Kroah-Hartman wrote: > On Wed, Oct 30, 2013 at 07:30:34PM +0800, Peng Tao wrote: >> so that we can get rid of cfs_get_environ() that needs access_process_vm() >> that >> is a core mm function and is not available on some architectures. >> >> Reviewed-by: Niu Yawei >> Signed-off-by: Peng Tao >> Signed-off-by: Andreas Dilger >> --- >> drivers/staging/lustre/lustre/obdclass/class_obd.c | 119 >> +++- >> 1 file changed, 115 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c >> b/drivers/staging/lustre/lustre/obdclass/class_obd.c >> index b1024a6..99ce863 100644 >> --- a/drivers/staging/lustre/lustre/obdclass/class_obd.c >> +++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c >> @@ -96,8 +96,117 @@ EXPORT_SYMBOL(obd_dirty_transit_pages); >> char obd_jobid_var[JOBSTATS_JOBID_VAR_MAX_LEN + 1] = JOBSTATS_DISABLE; >> EXPORT_SYMBOL(obd_jobid_var); >> >> -/* Get jobid of current process by reading the environment variable >> - * stored in between the "env_start" & "env_end" of task struct. >> +static char *self_environ_file = "/proc/self/environ"; > > Heh, no, that's not ok at all. > > This is a _huge_ sign that you are doing something wrong in your driver > if you need something that isn't exported, or that you have to dig out > of proc. > Lustre has a functionality that lets applications specify a global jobid by setting the same environment variable across the cluster and thus server can trace IO that is issued by different job. Currently it is implemented by having a private version of cfs_access_process_vm because access_process_vm is not exported. The patch intends to change it to reading from proc. If we cannot dig it out from proc either, how can we keep the functionality then? Please advise. Thanks, Tao > Sorry, I can't take this, please fix the underlying problems that would > even think that you need access to the environment from within a kernel > driver. > > greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc
On Wed, Oct 30, 2013 at 07:30:34PM +0800, Peng Tao wrote: > so that we can get rid of cfs_get_environ() that needs access_process_vm() > that > is a core mm function and is not available on some architectures. > > Reviewed-by: Niu Yawei > Signed-off-by: Peng Tao > Signed-off-by: Andreas Dilger > --- > drivers/staging/lustre/lustre/obdclass/class_obd.c | 119 > +++- > 1 file changed, 115 insertions(+), 4 deletions(-) > > diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c > b/drivers/staging/lustre/lustre/obdclass/class_obd.c > index b1024a6..99ce863 100644 > --- a/drivers/staging/lustre/lustre/obdclass/class_obd.c > +++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c > @@ -96,8 +96,117 @@ EXPORT_SYMBOL(obd_dirty_transit_pages); > char obd_jobid_var[JOBSTATS_JOBID_VAR_MAX_LEN + 1] = JOBSTATS_DISABLE; > EXPORT_SYMBOL(obd_jobid_var); > > -/* Get jobid of current process by reading the environment variable > - * stored in between the "env_start" & "env_end" of task struct. > +static char *self_environ_file = "/proc/self/environ"; Heh, no, that's not ok at all. This is a _huge_ sign that you are doing something wrong in your driver if you need something that isn't exported, or that you have to dig out of proc. Sorry, I can't take this, please fix the underlying problems that would even think that you need access to the environment from within a kernel driver. greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/4] staging/lustre/obdclass: read jobid from proc
so that we can get rid of cfs_get_environ() that needs access_process_vm() that is a core mm function and is not available on some architectures. Reviewed-by: Niu Yawei Signed-off-by: Peng Tao Signed-off-by: Andreas Dilger --- drivers/staging/lustre/lustre/obdclass/class_obd.c | 119 +++- 1 file changed, 115 insertions(+), 4 deletions(-) diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c b/drivers/staging/lustre/lustre/obdclass/class_obd.c index b1024a6..99ce863 100644 --- a/drivers/staging/lustre/lustre/obdclass/class_obd.c +++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c @@ -96,8 +96,117 @@ EXPORT_SYMBOL(obd_dirty_transit_pages); char obd_jobid_var[JOBSTATS_JOBID_VAR_MAX_LEN + 1] = JOBSTATS_DISABLE; EXPORT_SYMBOL(obd_jobid_var); -/* Get jobid of current process by reading the environment variable - * stored in between the "env_start" & "env_end" of task struct. +static char *self_environ_file = "/proc/self/environ"; +static int obd_get_environ(const char *key, char *value, int *val_len) +{ + struct mm_struct *mm; + struct path path; + struct file *filp = NULL; + int buf_len = PAGE_CACHE_SIZE; + int key_len = strlen(key); + char *buffer = NULL; + loff_t pos = 0; + int rc; + + /* +* test mm->mmap_sem to avoid deadlock if this ever gets called from +* mmap code. +*/ + mm = get_task_mm(current); + if (!mm) + return -EINVAL; + if (down_read_trylock(>mmap_sem) == 0) { + mmput(mm); + return -EDEADLK; + } + up_read(>mmap_sem); + mmput(mm); + + buffer = kmalloc(buf_len, GFP_NOIO); + if (!buffer) + return -ENOMEM; + + rc = kern_path(self_environ_file, LOOKUP_FOLLOW, ); + if (rc) + goto out; + + filp = dentry_open(, O_RDONLY, current_cred()); + if (IS_ERR(filp)) { + rc = PTR_ERR(filp); + filp = NULL; + goto out; + } + + /* loop reading... */ + while (1) { + int scan_len, this_len; + char *env_start, *env_end; + rc = kernel_read(filp, pos, buffer, buf_len); + if (rc <= 0) + break; + + pos += rc; + /* Parse the buffer to find out the specified key/value pair. +* The "key=value" entries are separated by '\0'. */ + env_start = buffer; + scan_len = this_len = rc; + while (scan_len) { + char *entry; + int entry_len; + + env_end = memscan(env_start, '\0', scan_len); + LASSERT(env_end >= env_start && + env_end <= env_start + scan_len); + + /* The last entry of this buffer cross the buffer +* boundary, reread it in next cycle. */ + if (unlikely(env_end - env_start == scan_len)) { + /* This entry is too large to fit in buffer */ + if (unlikely(scan_len == this_len)) { + rc = -EINVAL; + CWARN("Environment variable '%s' too long: rc = %d\n", + key, rc); + goto out; + } + pos -= scan_len; + break; + } + + entry = env_start; + entry_len = env_end - env_start; + + /* Key length + length of '=' */ + if (entry_len > key_len + 1 && + !memcmp(entry, key, key_len)) { + entry += key_len + 1; + entry_len -= key_len + 1; + /* The 'value' buffer passed in is too small.*/ + if (entry_len >= *val_len) + GOTO(out, rc = -EOVERFLOW); + + memcpy(value, entry, entry_len); + *val_len = entry_len; + rc = 0; + goto out; + } + + scan_len -= (env_end - env_start + 1); + env_start = env_end + 1; + } + } + if (rc >= 0) + rc = -ENOENT; +out: + if (filp) + fput(filp); + if (buffer) + kfree(buffer); + return rc; +} + +/* + * Get jobid of current process by reading the environment variable + * from /proc/self/environ. * * TODO: * It's better to cache the jobid for later use if there is any @@ -126,14 +235,14 @@ int
[PATCH 2/4] staging/lustre/obdclass: read jobid from proc
so that we can get rid of cfs_get_environ() that needs access_process_vm() that is a core mm function and is not available on some architectures. Reviewed-by: Niu Yawei yawei@intel.com Signed-off-by: Peng Tao bergw...@gmail.com Signed-off-by: Andreas Dilger andreas.dil...@intel.com --- drivers/staging/lustre/lustre/obdclass/class_obd.c | 119 +++- 1 file changed, 115 insertions(+), 4 deletions(-) diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c b/drivers/staging/lustre/lustre/obdclass/class_obd.c index b1024a6..99ce863 100644 --- a/drivers/staging/lustre/lustre/obdclass/class_obd.c +++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c @@ -96,8 +96,117 @@ EXPORT_SYMBOL(obd_dirty_transit_pages); char obd_jobid_var[JOBSTATS_JOBID_VAR_MAX_LEN + 1] = JOBSTATS_DISABLE; EXPORT_SYMBOL(obd_jobid_var); -/* Get jobid of current process by reading the environment variable - * stored in between the env_start env_end of task struct. +static char *self_environ_file = /proc/self/environ; +static int obd_get_environ(const char *key, char *value, int *val_len) +{ + struct mm_struct *mm; + struct path path; + struct file *filp = NULL; + int buf_len = PAGE_CACHE_SIZE; + int key_len = strlen(key); + char *buffer = NULL; + loff_t pos = 0; + int rc; + + /* +* test mm-mmap_sem to avoid deadlock if this ever gets called from +* mmap code. +*/ + mm = get_task_mm(current); + if (!mm) + return -EINVAL; + if (down_read_trylock(mm-mmap_sem) == 0) { + mmput(mm); + return -EDEADLK; + } + up_read(mm-mmap_sem); + mmput(mm); + + buffer = kmalloc(buf_len, GFP_NOIO); + if (!buffer) + return -ENOMEM; + + rc = kern_path(self_environ_file, LOOKUP_FOLLOW, path); + if (rc) + goto out; + + filp = dentry_open(path, O_RDONLY, current_cred()); + if (IS_ERR(filp)) { + rc = PTR_ERR(filp); + filp = NULL; + goto out; + } + + /* loop reading... */ + while (1) { + int scan_len, this_len; + char *env_start, *env_end; + rc = kernel_read(filp, pos, buffer, buf_len); + if (rc = 0) + break; + + pos += rc; + /* Parse the buffer to find out the specified key/value pair. +* The key=value entries are separated by '\0'. */ + env_start = buffer; + scan_len = this_len = rc; + while (scan_len) { + char *entry; + int entry_len; + + env_end = memscan(env_start, '\0', scan_len); + LASSERT(env_end = env_start + env_end = env_start + scan_len); + + /* The last entry of this buffer cross the buffer +* boundary, reread it in next cycle. */ + if (unlikely(env_end - env_start == scan_len)) { + /* This entry is too large to fit in buffer */ + if (unlikely(scan_len == this_len)) { + rc = -EINVAL; + CWARN(Environment variable '%s' too long: rc = %d\n, + key, rc); + goto out; + } + pos -= scan_len; + break; + } + + entry = env_start; + entry_len = env_end - env_start; + + /* Key length + length of '=' */ + if (entry_len key_len + 1 + !memcmp(entry, key, key_len)) { + entry += key_len + 1; + entry_len -= key_len + 1; + /* The 'value' buffer passed in is too small.*/ + if (entry_len = *val_len) + GOTO(out, rc = -EOVERFLOW); + + memcpy(value, entry, entry_len); + *val_len = entry_len; + rc = 0; + goto out; + } + + scan_len -= (env_end - env_start + 1); + env_start = env_end + 1; + } + } + if (rc = 0) + rc = -ENOENT; +out: + if (filp) + fput(filp); + if (buffer) + kfree(buffer); + return rc; +} + +/* + * Get jobid of current process by reading the environment variable + * from /proc/self/environ. * * TODO: * It's better to cache the jobid for later use
Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc
On Wed, Oct 30, 2013 at 07:30:34PM +0800, Peng Tao wrote: so that we can get rid of cfs_get_environ() that needs access_process_vm() that is a core mm function and is not available on some architectures. Reviewed-by: Niu Yawei yawei@intel.com Signed-off-by: Peng Tao bergw...@gmail.com Signed-off-by: Andreas Dilger andreas.dil...@intel.com --- drivers/staging/lustre/lustre/obdclass/class_obd.c | 119 +++- 1 file changed, 115 insertions(+), 4 deletions(-) diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c b/drivers/staging/lustre/lustre/obdclass/class_obd.c index b1024a6..99ce863 100644 --- a/drivers/staging/lustre/lustre/obdclass/class_obd.c +++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c @@ -96,8 +96,117 @@ EXPORT_SYMBOL(obd_dirty_transit_pages); char obd_jobid_var[JOBSTATS_JOBID_VAR_MAX_LEN + 1] = JOBSTATS_DISABLE; EXPORT_SYMBOL(obd_jobid_var); -/* Get jobid of current process by reading the environment variable - * stored in between the env_start env_end of task struct. +static char *self_environ_file = /proc/self/environ; Heh, no, that's not ok at all. This is a _huge_ sign that you are doing something wrong in your driver if you need something that isn't exported, or that you have to dig out of proc. Sorry, I can't take this, please fix the underlying problems that would even think that you need access to the environment from within a kernel driver. greg k-h -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc
On Wed, Oct 30, 2013 at 9:21 PM, Greg Kroah-Hartman gre...@linuxfoundation.org wrote: On Wed, Oct 30, 2013 at 07:30:34PM +0800, Peng Tao wrote: so that we can get rid of cfs_get_environ() that needs access_process_vm() that is a core mm function and is not available on some architectures. Reviewed-by: Niu Yawei yawei@intel.com Signed-off-by: Peng Tao bergw...@gmail.com Signed-off-by: Andreas Dilger andreas.dil...@intel.com --- drivers/staging/lustre/lustre/obdclass/class_obd.c | 119 +++- 1 file changed, 115 insertions(+), 4 deletions(-) diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c b/drivers/staging/lustre/lustre/obdclass/class_obd.c index b1024a6..99ce863 100644 --- a/drivers/staging/lustre/lustre/obdclass/class_obd.c +++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c @@ -96,8 +96,117 @@ EXPORT_SYMBOL(obd_dirty_transit_pages); char obd_jobid_var[JOBSTATS_JOBID_VAR_MAX_LEN + 1] = JOBSTATS_DISABLE; EXPORT_SYMBOL(obd_jobid_var); -/* Get jobid of current process by reading the environment variable - * stored in between the env_start env_end of task struct. +static char *self_environ_file = /proc/self/environ; Heh, no, that's not ok at all. This is a _huge_ sign that you are doing something wrong in your driver if you need something that isn't exported, or that you have to dig out of proc. Lustre has a functionality that lets applications specify a global jobid by setting the same environment variable across the cluster and thus server can trace IO that is issued by different job. Currently it is implemented by having a private version of cfs_access_process_vm because access_process_vm is not exported. The patch intends to change it to reading from proc. If we cannot dig it out from proc either, how can we keep the functionality then? Please advise. Thanks, Tao Sorry, I can't take this, please fix the underlying problems that would even think that you need access to the environment from within a kernel driver. greg k-h -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc
On Wed, Oct 30, 2013 at 09:31:24PM +0800, Peng Tao wrote: On Wed, Oct 30, 2013 at 9:21 PM, Greg Kroah-Hartman gre...@linuxfoundation.org wrote: On Wed, Oct 30, 2013 at 07:30:34PM +0800, Peng Tao wrote: so that we can get rid of cfs_get_environ() that needs access_process_vm() that is a core mm function and is not available on some architectures. Reviewed-by: Niu Yawei yawei@intel.com Signed-off-by: Peng Tao bergw...@gmail.com Signed-off-by: Andreas Dilger andreas.dil...@intel.com --- drivers/staging/lustre/lustre/obdclass/class_obd.c | 119 +++- 1 file changed, 115 insertions(+), 4 deletions(-) diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c b/drivers/staging/lustre/lustre/obdclass/class_obd.c index b1024a6..99ce863 100644 --- a/drivers/staging/lustre/lustre/obdclass/class_obd.c +++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c @@ -96,8 +96,117 @@ EXPORT_SYMBOL(obd_dirty_transit_pages); char obd_jobid_var[JOBSTATS_JOBID_VAR_MAX_LEN + 1] = JOBSTATS_DISABLE; EXPORT_SYMBOL(obd_jobid_var); -/* Get jobid of current process by reading the environment variable - * stored in between the env_start env_end of task struct. +static char *self_environ_file = /proc/self/environ; Heh, no, that's not ok at all. This is a _huge_ sign that you are doing something wrong in your driver if you need something that isn't exported, or that you have to dig out of proc. Lustre has a functionality that lets applications specify a global jobid by setting the same environment variable across the cluster and thus server can trace IO that is issued by different job. Currently it is implemented by having a private version of cfs_access_process_vm because access_process_vm is not exported. The patch intends to change it to reading from proc. If we cannot dig it out from proc either, how can we keep the functionality then? Please advise. Kernel code should not care about environment variables. Just because you created a crazy requirement, does not mean it is a valid one, nor that I have to be the one to tell you how to implement it. I'm never going to take code that parses proc files, you all know better than that. greg k-h -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/4] staging/lustre/obdclass: read jobid from proc
On Wed, Oct 30, 2013 at 10:20 PM, Greg Kroah-Hartman gre...@linuxfoundation.org wrote: On Wed, Oct 30, 2013 at 09:31:24PM +0800, Peng Tao wrote: On Wed, Oct 30, 2013 at 9:21 PM, Greg Kroah-Hartman gre...@linuxfoundation.org wrote: On Wed, Oct 30, 2013 at 07:30:34PM +0800, Peng Tao wrote: so that we can get rid of cfs_get_environ() that needs access_process_vm() that is a core mm function and is not available on some architectures. Reviewed-by: Niu Yawei yawei@intel.com Signed-off-by: Peng Tao bergw...@gmail.com Signed-off-by: Andreas Dilger andreas.dil...@intel.com --- drivers/staging/lustre/lustre/obdclass/class_obd.c | 119 +++- 1 file changed, 115 insertions(+), 4 deletions(-) diff --git a/drivers/staging/lustre/lustre/obdclass/class_obd.c b/drivers/staging/lustre/lustre/obdclass/class_obd.c index b1024a6..99ce863 100644 --- a/drivers/staging/lustre/lustre/obdclass/class_obd.c +++ b/drivers/staging/lustre/lustre/obdclass/class_obd.c @@ -96,8 +96,117 @@ EXPORT_SYMBOL(obd_dirty_transit_pages); char obd_jobid_var[JOBSTATS_JOBID_VAR_MAX_LEN + 1] = JOBSTATS_DISABLE; EXPORT_SYMBOL(obd_jobid_var); -/* Get jobid of current process by reading the environment variable - * stored in between the env_start env_end of task struct. +static char *self_environ_file = /proc/self/environ; Heh, no, that's not ok at all. This is a _huge_ sign that you are doing something wrong in your driver if you need something that isn't exported, or that you have to dig out of proc. Lustre has a functionality that lets applications specify a global jobid by setting the same environment variable across the cluster and thus server can trace IO that is issued by different job. Currently it is implemented by having a private version of cfs_access_process_vm because access_process_vm is not exported. The patch intends to change it to reading from proc. If we cannot dig it out from proc either, how can we keep the functionality then? Please advise. Kernel code should not care about environment variables. Just because you created a crazy requirement, does not mean it is a valid one, nor that I have to be the one to tell you how to implement it. I'm never going to take code that parses proc files, you all know better than that. I see. I'll drop the patchset and keep the cfs_access_process_vm code until better option is found. Thanks, Tao -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/