On Mar 24, 2009, at 8:57 PM, Rolf Turner wrote:


On 25/03/2009, at 12:09 PM, Frank E Harrell Jr wrote:

        <snip>

(2) Scrolling down to ``Byar and Green prostate cancer data'' appeared to get
me to the right place. But I couldn't see any signs of any ``R binary files''.

Please look again.  It's under the heading "R".  Unfortunately I used
.sav suffix for save() files in the old days.

        Ah-ha.  Oh me of little faith.  I have been hanging around (in
        my current work environment) with too many SPSS users, and the
        *.sav extension seems to be the standard for SPSS data files.
        Whence my corrupted thinking.

The .xls file opened with no problem in OpenOffice; it has 506 rows.

        Hmmm.  When I opened it with Excel on the Mac I got a spreadsheet
        with 503 rows --- the first row being the column names, so there
        were really 502 rows of data.

The last "patnr" is "506" but there are only 502 lines of data. Patient numbers 471, 473, 475 and 488 are missing.
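
For anyone who wants to repeat the check on their own copy, here is a minimal sketch in Python of one way to list gaps in an ID column. The function name missing_ids and the stand-in list are made up for illustration; the list is rebuilt from the counts reported above and assumes the numbering starts at 1 and is otherwise contiguous. It is not read from the actual file.

```python
def missing_ids(ids, first=None, last=None):
    """Return the sorted integers in [first, last] that do not appear in ids.

    If first/last are omitted, the smallest and largest observed IDs are used.
    """
    present = set(ids)
    lo = min(present) if first is None else first
    hi = max(present) if last is None else last
    return [i for i in range(lo, hi + 1) if i not in present]


# Stand-in for the patient-number column (an assumption, not the real data):
# 1..506 with the four gaps reported in this message removed.
reported_gaps = {471, 473, 475, 488}
patient_numbers = [i for i in range(1, 507) if i not in reported_gaps]

print(len(patient_numbers))          # 502
print(missing_ids(patient_numbers))  # [471, 473, 475, 488]
```

The set lookup keeps the check linear in the number of rows. In R, base setdiff() applied to 1:506 and the patient-number column should give the same list.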

And the CMU Statlib version for 2002 looks the same.
http://lib.stat.cmu.edu/S/Harrell/data/descriptions/prostate.html

The version at this site is missing more than 25 cases:
http://www.imbi.uni-freiburg.de/biom/Royston-Sauerbrei-book/

Here are two other copies of the dataset, the first of which appears to have those missing cases.
This one has patient numbers:
http://lib.stat.cmu.edu/datasets/Andrews/T46.1

This one has a description of the fields and cites the one above, but it has not retained the patient numbers and has apparently kept only the 475 cases with complete data:
http://www.stats.waikato.ac.nz/Staff/maj/multimix/cancerdesc.txt
http://www.stats.waikato.ac.nz/Staff/maj/multimix/cancer%20data.txt



David Winsemius, MD
Heritage Laboratories
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
