Re: [R] How to import sensitive data when multiple users collaborate on R-script?

2016-05-31 Thread MacQueen, Don
There are lots of ways to handle this kind of thing, and the other
suggestions are good. But specific to your "something like" idea, see the
output of

  Sys.info()

in particular
  Sys.info()['nodename']
  Sys.info()['user']
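
As a minimal sketch of how that could be used (the machine names, the
paths, and the helper name data_dir_for() are placeholders assumed for
illustration, not anything from the original script), one could keep a
small lookup table keyed by node name:

```r
# Hypothetical lookup table: machine name -> local data directory.
# The node names and paths below are placeholders, not real values.
data_dirs <- c(
  laptop1 = "/path/to_researcher1_computer",
  laptop2 = "/path/to_researcher2_computer"
)

# Return the data directory registered for a machine,
# or stop with a clear message if the machine is unknown.
data_dir_for <- function(node = Sys.info()[["nodename"]], dirs = data_dirs) {
  if (!node %in% names(dirs)) {
    stop("No data directory configured for machine '", node, "'")
  }
  dirs[[node]]
}

# Usage (not run here, since it needs a configured machine and the file):
# data <- read.csv(file.path(data_dir_for(), "sensitive_data.csv"))
```

The script then contains one import line for everyone, and only the
lookup table grows when a collaborator joins.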

-Don

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 5/31/16, 3:44 AM, "R-help on behalf of Nikolai Stenfors" wrote:

>We conduct medical research and our datafiles therefore contain sensitive
>data, not to be shared in the cloud (Dropbox, Box, Drive, Bitbucket,
>GitHub).
>When we collaborate on an R analysis script, we stumble upon the following
>annoyance. Researcher 1 has a line in the script importing the sensitive
>data from his/her personal computer. Researcher 2 has to put an additional
>line importing the data from his/her personal computer. Thus, we have
>lines in the script that are unnecessary for one or the other researcher.
>How can we avoid this? Is there another way of conducting the
>collaboration, or another workflow?
>
>I'm perhaps looking for something like:
>"If the script is run on researcher 1 computer, load file from this
>directory. If the script is run on researcher 2 computer, load data from
>that directory". 
>
>Example:
>## Import data-
># Researcher 1 import data from laptop1, unnecessary line for Researcher 2
>data <- read.table("/path/to_researcher1_computer/sensitive_data.csv")
>
># Researcher 2 import data from laptop2 (unnecessary line for Researcher 1)
>data <- read.table("/path/to_researcher2_computer/sensitive_data.csv")
>
>## Clean data
>data$var1 <- NULL
>
>## Analyze data
>boxplot(data$var2)
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.



Re: [R] How to import sensitive data when multiple users collaborate on R-script?

2016-05-31 Thread Jeff Newmiller
Assume everyone will begin their work in a suitable working directory for their
computer. Put the data in that working directory or in some directory "near" it.
Then use relative paths to the data instead of absolute paths (don't use paths
that start with "/"). I usually start by reading in a "configuration" file that
I keep customized per computer and that includes such things as the names of
the files I want to analyze. Sometimes there is only one row in that file; other
times I select one row on the fly to use.
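
A minimal sketch of the configuration-file idea, assuming a hypothetical
per-computer file named config.csv with made-up columns label and
data_file (none of these names come from an actual setup):

```r
# Read a per-computer configuration file and return the selected row as a list.
# "config.csv" and its columns (label, data_file) are hypothetical names.
read_config <- function(path = "config.csv", row = 1) {
  cfg <- read.csv(path, stringsAsFactors = FALSE)
  if (row < 1 || row > nrow(cfg)) {
    stop("Row ", row, " is not available in ", path)
  }
  as.list(cfg[row, , drop = FALSE])
}

# Usage (not run here): each collaborator keeps their own config.csv, e.g.
#   label,data_file
#   main,data/sensitive_data.csv
# cfg  <- read_config()
# data <- read.csv(cfg$data_file)
```

The configuration file stays out of whatever is shared, so the analysis
script itself never mentions a machine-specific path.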
-- 
Sent from my phone. Please excuse my brevity.



Re: [R] How to import sensitive data when multiple users collaborate on R-script?

2016-05-31 Thread John McKown
Can you have the researchers input the name of the data file to be
analyzed? I use code similar to:

arguments <- commandArgs(trailingOnly = TRUE)
#
# I put in the next command due to my own ignorance.
# If you invoke an R script file using just R, you
# need to say something like:
# R CMD BATCH "--args ... other arguments ..." script.R
#
# but if you use Rscript, you invoke it like:
# Rscript script.R ... other arguments ...
#
# Well, I got confused and did:
# Rscript script.R --args ... other arguments ...
#
# The next line adjusts for my own idiocy. The length check avoids an
# error when the script is run with no arguments at all.
if (length(arguments) > 0 && arguments[1] == "--args") arguments <- arguments[-1]
#
for (file in arguments) {
...
}

Please ignore the line about my own idiocy :-}

Another thought is to use an environment variable which is set in the
user's logon profile (or the Windows registry, forgive my ignorance of
Windows). I think this would be something like:

filename <- Sys.getenv("FILENAME")
if (filename == "") {
  # ... no file name in environment, what to do?
}

You could have someone do this for the user, if he is not familiar with
the process.
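
One possible answer to the "what to do?" branch, as a sketch only: fail
early with a message that tells the user how to fix it. The helper name
data_path_from_env() is made up for illustration; the ~/.Renviron file
mentioned in the message is one place R reads environment variables from
at startup.

```r
# Look up the data file path from an environment variable.
# "FILENAME" follows the example above; any variable name would do.
data_path_from_env <- function(var = "FILENAME") {
  path <- Sys.getenv(var, unset = "")
  if (!nzchar(path)) {
    stop("Environment variable ", var, " is not set; ",
         "define it, for example in your ~/.Renviron file")
  }
  path
}

# Usage (not run here, since it needs the variable and the data file):
# data <- read.csv(data_path_from_env())
```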


-- 
The unfacts, did we have them, are too imprecisely few to warrant our
certitude.

Maranatha! <><
John McKown


Re: [R] How to import sensitive data when multiple users collaborate on R-script?

2016-05-31 Thread Tom Wright
My general approach to this is to put the function for loading data
into a separate file which is then sourced in the main analysis file.
Occasionally I'll use a construct like:

if (file.exists("loadData_local.R")) {
  source("loadData_local.R")
} else {
  source("loadData_generic.R")
}

Where loadData_generic.R contains the path to some sample (non-sensitive) data.
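
To illustrate, here is a sketch of what the shared generic file might
contain. The function name load_data() and the sample values are
assumptions made up for this example; var1 and var2 are the column names
from the original question.

```r
# Possible contents of a shared loadData_generic.R: it defines load_data()
# (a hypothetical name) returning small made-up, non-sensitive sample data
# with the same columns the analysis expects.
load_data <- function() {
  data.frame(
    var1 = c("a", "b", "c"),
    var2 = c(1.5, 2.0, 3.5)
  )
}

# A private loadData_local.R would define load_data() too, but read the
# real file instead, e.g.:
# load_data <- function() read.csv("/path/to_researcher1_computer/sensitive_data.csv")

# The main analysis script then only ever calls load_data():
data <- load_data()
data$var1 <- NULL
```

Because both files define the same function, the analysis code does not
change depending on which one was sourced.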



[R] How to import sensitive data when multiple users collaborate on R-script?

2016-05-31 Thread Nikolai Stenfors
We conduct medical research and our datafiles therefore contain sensitive
data, not to be shared in the cloud (Dropbox, Box, Drive, Bitbucket, GitHub).
When we collaborate on an R analysis script, we stumble upon the following
annoyance. Researcher 1 has a line in the script importing the sensitive
data from his/her personal computer. Researcher 2 has to put an additional
line importing the data from his/her personal computer. Thus, we have lines
in the script that are unnecessary for one or the other researcher. How can
we avoid this? Is there another way of conducting the collaboration, or
another workflow?

I'm perhaps looking for something like:
"If the script is run on researcher 1 computer, load file from this
directory. If the script is run on researcher 2 computer, load data from
that directory". 

Example:
## Import data-
# Researcher 1 import data from laptop1, unnecessary line for Researcher 2
data <- read.table("/path/to_researcher1_computer/sensitive_data.csv")

# Researcher 2 import data from laptop2 (unnecessary line for Researcher 1)
data <- read.table("/path/to_researcher2_computer/sensitive_data.csv")

## Clean data
data$var1 <- NULL

## Analyze data
boxplot(data$var2)
