Re: [R] How to import sensitive data when multiple users collaborate on R-script?
There are lots of ways to handle this kind of thing, and the other suggestions are good. But specific to your "something like" idea, see the output of Sys.info(), in particular

  Sys.info()['nodename']
  Sys.info()['user']

-Don

-- 
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062

On 5/31/16, 3:44 AM, "R-help on behalf of Nikolai Stenfors" wrote:

>I'm perhaps looking for something like:
>"If the script is run on researcher 1 computer, load file from this
>directory. If the script is run on researcher 2 computer, load data from
>that directory".
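As a rough sketch of how the Sys.info() idea might be used, the script could look up the data directory by login name. The user names "researcher1" and "researcher2" and the helper data_dir_for() are hypothetical placeholders; the directories are the ones from the example in the original post.

```r
# Hypothetical mapping from login name to the local data directory.
# The names and paths are placeholders; each collaborator would add
# their own entry.
data_dirs <- c(
  researcher1 = "/path/to_researcher1_computer",
  researcher2 = "/path/to_researcher2_computer"
)

# Return the data directory registered for a user, or stop with a
# clear message if the user is not in the mapping.
data_dir_for <- function(user, dirs = data_dirs) {
  if (!user %in% names(dirs)) {
    stop("No data directory configured for user '", user, "'")
  }
  dirs[[user]]
}

# Typical use inside the analysis script:
# data <- read.table(file.path(data_dir_for(Sys.info()[["user"]]),
#                              "sensitive_data.csv"))
```

The same lookup could be keyed on Sys.info()[["nodename"]] instead, if two researchers happen to share a login name but work on different machines.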
__ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to import sensitive data when multiple users collaborate on R-script?
Assume everyone will begin their work in a suitable working directory for their computer. Put the data in that working directory or in some directory "near" it. Then use relative paths to the data instead of absolute paths (don't use paths that start with "/").

I usually start by reading in a "configuration" file that I keep customized per computer and that includes such things as the names of the files I want to analyze. Sometimes there is only one row in that file; other times I select one row on the fly to use.

-- 
Sent from my phone. Please excuse my brevity.

On May 31, 2016 3:44:21 AM PDT, Nikolai Stenfors wrote:

># Researcher 1 import data from laptop1, unnecessary line for Researcher 2
>data <- read.table("/path/to_researcher1_computer/sensitive_data.csv")
>
># Researcher 2 import data from laptop2 (unnecessary line for Researcher 1)
>data <- read.table("/path/to_researcher2_computer/sensitive_data.csv")
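A minimal sketch of the per-computer configuration file combined with relative paths might look like this. The file name "config.csv", the column name "data_file", and the helper read_config_path() are assumptions made for illustration. The temporary directory only stands in for one researcher's working directory so that the example runs on its own.

```r
# Read a small per-computer configuration file and return the path of
# the data file named in the chosen row. The file name "config.csv" and
# the column name "data_file" are assumptions made for this sketch.
read_config_path <- function(config_file = "config.csv", row = 1) {
  config <- read.csv(config_file, stringsAsFactors = FALSE)
  if (row > nrow(config)) {
    stop("Configuration file has no row ", row)
  }
  config$data_file[row]
}

# Demonstration in a temporary directory standing in for one
# researcher's working directory (made-up, non-sensitive data).
demo_dir <- tempfile("project_")
dir.create(file.path(demo_dir, "data"), recursive = TRUE)
writeLines(c("data_file", "data/sensitive_data.csv"),
           file.path(demo_dir, "config.csv"))
write.csv(data.frame(var1 = 1:3, var2 = c(2.5, 3.1, 4.8)),
          file.path(demo_dir, "data", "sensitive_data.csv"),
          row.names = FALSE)

old_wd <- setwd(demo_dir)
data_path <- read_config_path()   # relative path, no leading "/"
data <- read.csv(data_path)
setwd(old_wd)
```

Only the shared script would be exchanged between collaborators. The configuration file and the data stay on each computer, so no researcher-specific line is needed in the script itself.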
Re: [R] How to import sensitive data when multiple users collaborate on R-script?
On Tue, May 31, 2016 at 5:44 AM, Nikolai Stenfors <nikolai.stenf...@gapps.umu.se> wrote:

> Researcher 1 has a line in the script importing the sensitive
> data from his/her personal computer. Researcher 2 has to put an additional
> line importing the data from his/her personal computer. Thus, we have lines
> in the script that are unnecessary for one or the other researcher. How can
> we avoid this?

Can you have the researchers input the name of the data file to be analyzed? I use code similar to:

arguments <- commandArgs(trailingOnly=TRUE)
#
# I put in the next command due to my own ignorance.
# If you invoke an R script file using just R, you
# need to say something like:
# R CMD BATCH '--args ... other arguments ...' script.R
#
# but if you use Rscript, you invoke it like:
# Rscript script.R ... other arguments ...
#
# Well, I got confused and did:
# Rscript script.R --args ... other arguments ...
#
# The next line adjusts for my own idiocy.
if (length(arguments) > 0 && arguments[1] == "--args") arguments <- arguments[-1]
#
for (file in arguments) {
  ...
}

Please ignore the line about my own idiocy :-}

Another thought is to use an environment variable which is set in the user's logon profile (or the Windows registry; forgive my ignorance of Windows). I think this would be something like:

filename <- Sys.getenv("FILENAME")
if (filename == "") {
  # ... no file name in environment, what to do?
}

You could have someone do this for the user, if he is not familiar with the process.

-- 
The unfacts, did we have them, are too imprecisely few to warrant our certitude.

Maranatha! <><
John McKown
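The two ideas in this message (command-line arguments and an environment variable) could be combined along these lines. This is only a sketch: the helper name resolve_data_file() is hypothetical, and the order of precedence (argument first, then FILENAME) is an assumption rather than something the message specifies.

```r
# Choose the data file from the command-line arguments first, then from
# an environment variable. The variable name "FILENAME" follows the
# message above; the helper name resolve_data_file() is hypothetical.
resolve_data_file <- function(args = commandArgs(trailingOnly = TRUE),
                              env_value = Sys.getenv("FILENAME")) {
  # Drop a stray leading "--args", if present.
  if (length(args) > 0 && args[1] == "--args") {
    args <- args[-1]
  }
  # An explicit argument wins over the environment variable.
  if (length(args) > 0) {
    return(args[1])
  }
  # Sys.getenv() returns "" for an unset variable.
  if (nzchar(env_value)) {
    return(env_value)
  }
  stop("No data file given: pass a path as an argument or set FILENAME")
}

# Typical use inside the analysis script:
# data <- read.table(resolve_data_file())
```

Stopping with an error is one possible answer to the "what to do?" question when neither source provides a file name. Prompting the user would be another.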
Re: [R] How to import sensitive data when multiple users collaborate on R-script?
My general approach to this is to put the function for loading data into a separate file, which is then sourced in the main analysis file. Occasionally I'll use a construct like:

if (file.exists("loadData_local.R")) {
  source("loadData_local.R")
} else {
  source("loadData_generic.R")
}

where loadData_generic.R contains the path to some sample (non-sensitive) data.

On Tue, May 31, 2016 at 6:44 AM, Nikolai Stenfors wrote:

> We conduct medical research and our datafiles therefore contain sensitive
> data, not to be shared in the cloud (Dropbox, Box, Drive, Bitbucket, GitHub).
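To make the local/generic loader construct concrete, here is a small self-contained sketch. The two file names come from the message above. The helper source_loader() and the load_data() function that each loader file is expected to define are assumptions made for illustration.

```r
# Source a local loader if one exists, otherwise fall back to the
# generic one. The helper name source_loader() and the load_data()
# function defined by each loader file are assumptions for this sketch.
source_loader <- function(dir = ".") {
  local_file <- file.path(dir, "loadData_local.R")
  generic_file <- file.path(dir, "loadData_generic.R")
  chosen <- if (file.exists(local_file)) local_file else generic_file
  source(chosen)
  invisible(chosen)
}

# Demonstration: only the generic loader is present, so it is used.
# It returns made-up sample data, not anything sensitive.
demo_dir <- tempfile("loader_")
dir.create(demo_dir)
writeLines("load_data <- function() data.frame(var2 = c(1, 2, 3))",
           file.path(demo_dir, "loadData_generic.R"))
used <- source_loader(demo_dir)
data <- load_data()
```

In this arrangement each researcher would keep their own loadData_local.R on their own computer, presumably outside whatever is shared, while loadData_generic.R travels with the script.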
[R] How to import sensitive data when multiple users collaborate on R-script?
We conduct medical research and our data files therefore contain sensitive data, not to be shared in the cloud (Dropbox, Box, Drive, Bitbucket, GitHub). When we collaborate on an R analysis script, we stumble upon the following annoyance. Researcher 1 has a line in the script importing the sensitive data from his/her personal computer. Researcher 2 has to add another line importing the data from his/her personal computer. Thus, we have lines in the script that are unnecessary for one or the other researcher. How can we avoid this? Is there another way of conducting the collaboration, some other workflow?

I'm perhaps looking for something like:
"If the script is run on researcher 1's computer, load the file from this directory. If the script is run on researcher 2's computer, load the data from that directory."

Example:

## Import data
# Researcher 1 imports data from laptop1 (unnecessary line for Researcher 2)
data <- read.table("/path/to_researcher1_computer/sensitive_data.csv")

# Researcher 2 imports data from laptop2 (unnecessary line for Researcher 1)
data <- read.table("/path/to_researcher2_computer/sensitive_data.csv")

## Clean data
data$var1 <- NULL

## Analyze data
boxplot(data$var2)