Dear R-helpers,

I’m writing for advice on whether I should use R or a different package or
language. I’ve looked through the R-help archives, some manuals, and some
other sites as well, and I haven’t done too well finding relevant info,
hence my question here.

I’m working with hierarchical data (in SPSS lingo). That is, for each case
(person) I read in three types of (medical) record:

1. demographic data: name, age, sex, address, etc.

2. ‘admissions’ data: this generally repeats, so I will have 20 or so
variables relating to their first hospital admission, then the same 20 again
for their second admission, and so on

3. ‘collections’ data, about 100 variables containing the results of a
battery of standard tests. These are administered at intervals and so this
is repeating data as well.

The number of repetitions varies between cases, so in its one-case-per-line
format the data is non-rectangular.
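
To make the structure concrete, here is a rough sketch of how I imagine the
same data might look in R if it were split into three tables linked by a
person identifier, rather than one line per case. This is only an
illustration under my own assumptions: every column name and value below
(person_id, admission_no, ward, collection_no, test_score) is made up, and
only base R functions (data.frame, merge) are used.

```r
# Toy data only: all column names and values are invented for illustration.

# One row per person.
demographics <- data.frame(
  person_id = c(1, 2),
  age       = c(54, 61),
  sex       = c("F", "M")
)

# One row per admission, so the number of rows per person can vary.
admissions <- data.frame(
  person_id    = c(1, 1, 2),
  admission_no = c(1, 2, 1),
  ward         = c("A", "B", "A")
)

# One row per test collection, again with a varying number per person.
collections <- data.frame(
  person_id     = c(1, 1, 1, 2),
  collection_no = c(1, 2, 3, 1),
  test_score    = c(10, 12, 15, 9)
)

# Attach the demographic variables to each collection when an analysis
# needs both; rows are matched on person_id.
collections_with_demo <- merge(collections, demographics, by = "person_id")
```

In this layout each table is rectangular even though the number of
admissions and collections differs from person to person.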

At present I have shoehorned all of this into SPSS, with each case on one
line. My test database has 2,500 variables and 1,500 cases (or persons), and
in SPSS’s *.SAV format is ~4MB. The one I finally work with will be larger
again, though likely within one order of magnitude. Down the track, funding
permitting, I hope to be working with tens of thousands of cases.

I am wondering if I should keep using SPSS, or try something else.

The analyses I’ll typically have to do will involve comparing
measurements at different times, e.g. before/after treatment. I’ll also
need to compare groups of people, e.g. treatment/no treatment. Regression
and factor analyses will doubtless come into it at some point too.

So:

1. Should I use R or try something else?

2. Can anyone advise me on using R with the type of data I’ve described?


Many thanks,

Anton du Toit


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
