Is R an appropriate tool for data manipulation, reshaping, and organizing?
I think so, but someone who recently joined our group thinks not.
The new recruit believes that Python or another language is a far better
tool for developing data manipulation scripts that can then be used by
several members of our research group. Her assessment is that R is useful
only when it comes to data analysis and working with statistical models.
So what do you think:
1) R is a phenomenally powerful and flexible tool, and since you are going
to do your analyses in R you might as well use it to read the data in, merge
it, and reshape it into whatever you need.
OR
2) Are you crazy? Nobody in their right mind uses R to pipe the data around
their lab and assemble it for analysis.

Your insights would be appreciated.

Details if you are interested:

Our setup: hundreds of patients are recorded as cases with about 60 variables
each, entered and stored in a Sybase relational database. High-throughput SNP
genotyping platforms save their output to CSV or Excel tables. Previously,
not knowing any SQL, I had used Microsoft Access to write queries to get the
data that I needed and to merge the genotyping with the clinical database.
It was horrible. I could not even use it on anything other than my desktop
machine at work. When I realized that I was going to need to learn R to
handle the genetic analyses, I decided to keep Sybase as the data repository
for the clinical information and then do all the data manipulation, merging,
and piping with R using RODBC. I was and am a very amateur coder.
Nevertheless, many, many hours later I have scripts that do what I need
them to do, and I understand R code and can tinker with it as needed. My
scripts work for me but they are not exactly user-friendly for others in the
laboratory to just run. For instance, depending on what machine the script
is being run from, one may need to change the file name or file path and
tinker under the hood to accomplish that. My bias is to do all our data
manipulation and reshaping in R. Since I am the principal investigator, I am
the one who stays constant, while coders and analysts may come and go.
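
To make the setup above concrete, a minimal sketch of this kind of script
might look like the following. It is illustrative only, not the actual lab
code: the environment variable LAB_DATA_DIR, the file name genotypes.csv, the
column names patient_id, snp, and genotype, and the SQL query are all
hypothetical. The idea is that the data directory comes from an environment
variable, so nobody has to edit paths inside the script, and that the
genotype calls are reshaped to one row per patient before being merged with
the clinical records.

```r
# Hypothetical names throughout: LAB_DATA_DIR, genotypes.csv,
# patient_id, snp, genotype, and the patients table.

# Resolve the data directory from an environment variable so the script
# runs unchanged on any machine; fall back to the working directory.
data_dir <- function() {
  Sys.getenv("LAB_DATA_DIR", unset = getwd())
}

# Read long-format genotype calls (one row per patient per SNP).
read_genotypes <- function(file = "genotypes.csv", dir = data_dir()) {
  read.csv(file.path(dir, file), stringsAsFactors = FALSE)
}

# Pull clinical records over ODBC. Needs the RODBC package and a DSN that
# is already configured on the machine; the query is only a placeholder.
fetch_clinical <- function(dsn, query = "SELECT * FROM patients") {
  channel <- RODBC::odbcConnect(dsn)
  on.exit(RODBC::odbcClose(channel))
  RODBC::sqlQuery(channel, query)
}

# Reshape genotypes to one row per patient and one column per SNP,
# then left-join them onto the clinical records.
assemble <- function(clinical, genotypes) {
  wide <- reshape(genotypes, idvar = "patient_id", timevar = "snp",
                  direction = "wide")
  merge(clinical, wide, by = "patient_id", all.x = TRUE)
}

# Small in-memory example, so no database or file is needed to try it.
clinical <- data.frame(patient_id = c(1, 2, 3), age = c(54, 61, 47))
genotypes <- data.frame(
  patient_id = c(1, 1, 2, 2),
  snp        = c("rs1", "rs2", "rs1", "rs2"),
  genotype   = c("AA", "AG", "AG", "GG"),
  stringsAsFactors = FALSE
)
combined <- assemble(clinical, genotypes)
print(combined)
```

With real data, clinical would come from fetch_clinical() and genotypes from
read_genotypes(); patients without genotype calls are kept, with NA in the
SNP columns.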

I am even more enamored with R for data manipulation since reading a book
about it.


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
