Re: [R] R and SAS proc format

2007-03-07 Thread Ulrike Grömping
Ulrike Grömping wrote:
 The down side to R's factor solution:
 The numerical values of factors are always 1 to number of levels. Thus, it
 can be tough and requires great care to work with studies that have both
 numerical values different from this and value labels. This situation is
 currently not well-supported by R.

 Regards, Ulrike

 P.S.: I fully agree with Frank regarding the annoyance one sometimes
 encounters with formats in SAS!

Frank E Harrell Jr wrote:
 You can add an attribute to a variable.  In the sas.get function in the
 Hmisc package for example, when importing SAS variables that have PROC
 FORMAT value labels, an attribute 'sas.codes' keeps the original codes;
 these can be retrieved using sas.codes(variable name).  This could be
 done outside the SAS import context also.

 Frank

Frank,

are these attributes preserved when merging or subsetting a data frame?
Are they used in R packages other than Hmisc and Design (e.g. in a simple table 
request)?

If this is the case, my wishlist items 8658 and 8659 
(http://bugs.r-project.org/cgi-bin/R/wishlist?id=8658;user=guest, 
http://bugs.r-project.org/cgi-bin/R/wishlist?id=8659;user=guest) can be closed. 
Otherwise, I maintain the opinion that there are workarounds but that R is not 
satisfactorily able to handle this type of data.

Regards, Ulrike

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R and SAS proc format

2007-03-07 Thread Frank E Harrell Jr
Ulrike Grömping wrote:
 
 
 Frank,
 
 are these attributes preserved when merging or subsetting a data frame?
 Are they used in R packages other than Hmisc and Design (e.g. in a 
 simple table request)?

No; you would need to add functions like those that are used by the Hmisc
label or impute functions.  And they are not used outside Hmisc/Design.
In fact I have little need for them, as I always find the final labels
to be the key to analysis.

 
 If this is the case, my wishlist items 8658 and 8659 
 (http://bugs.r-project.org/cgi-bin/R/wishlist?id=8658;user=guest, 
 http://bugs.r-project.org/cgi-bin/R/wishlist?id=8659;user=guest) can be 
 closed.
 Otherwise, I maintain the opinion that there are workarounds but that R 
 is not satisfactorily able to handle this type of data.

R gives the framework for doing this elegantly but the user has an 
overhead of implementing new methods for such attributes.
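
To make the overhead concrete, here is a minimal base-R sketch. The attribute name "orig.codes" and the class name "coded_factor" are made up for illustration and are not part of R, Hmisc or Design. It shows a plain factor losing an extra attribute on subsetting, and a small S3 "[" method that carries it along. merge() and other operations would need their own handling, and the sketch does not deal with drop = TRUE.

```r
x <- factor(c("A", "B", "C"))
attr(x, "orig.codes") <- c(5, 7, 9)   # illustrative attribute name

# Subsetting a plain factor drops the extra attribute
is.null(attr(x[1:2], "orig.codes"))
# [1] TRUE

# Constructor for a hypothetical S3 class that keeps one code per level
coded_factor <- function(f, codes) {
  stopifnot(is.factor(f), length(codes) == nlevels(f))
  attr(f, "orig.codes") <- codes
  class(f) <- c("coded_factor", class(f))
  f
}

# "[" method: subset as a factor, then restore the attribute and class
`[.coded_factor` <- function(x, ...) {
  y <- NextMethod("[")
  attr(y, "orig.codes") <- attr(x, "orig.codes")
  class(y) <- class(x)
  y
}

cf <- coded_factor(factor(c("A", "B", "C")), c(5, 7, 9))
attr(cf[2:3], "orig.codes")
# [1] 5 7 9
```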

Cheers
Frank

 


-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University



Re: [R] R and SAS proc format

2007-03-07 Thread Peter Dalgaard
Jason Barnhart wrote:
 On Tuesday, March 06, 2007 2:13 PM, John Kane wrote:
  lamack lamack wrote:
   Dear all, Is there an R equivalent to SAS's proc format?

  What does the SAS PROC FORMAT do?

 It formats or reformats data in the SAS system.
   

Slightly more precisely: It creates user-defined formats, which are
subsequently associated with variables and used for reading, printing,
tabulating, and analyzing data. It is akin to R's factor()
constructions, but not quite. For one thing, SAS's formats are separate
entities: the same format can be used for many variables, whereas R's
factors have the formatting coded as a part of the object. For related
reasons, a variable in SAS can have more distinct values than there are
value labels for, etc.
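
One way to approximate a separate, reusable format in R is a named character vector kept apart from the data. This is only a sketch: the lookup yesno_fmt, the helper apply_fmt() and the example variables are all made up for illustration.

```r
# One lookup table, defined once and kept separate from the data
# (names are the codes, values are the labels)
yesno_fmt <- c("1" = "Yes", "2" = "No")

smoker   <- c(1, 2, 2, 1)
diabetic <- c(2, 2, 1, 9)   # 9 has no label in the lookup

# Hypothetical helper: apply the same lookup to any variable
apply_fmt <- function(x, fmt) unname(fmt[as.character(x)])

apply_fmt(smoker, yesno_fmt)
# [1] "Yes" "No"  "No"  "Yes"
apply_fmt(diabetic, yesno_fmt)
# [1] "No"  "No"  "Yes" NA
```

The underlying numeric values stay untouched, so a variable can hold more distinct values than the lookup has labels for; here the unlabelled 9 simply comes back as NA.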
 It looks like this:

 proc format;
   value kanefmt 1='A' 2='B' 3='C' 4='X' 5='Throw me out';
 data temp;
   do i=1 to 10;
     kanevar=put(i,kanefmt.);
     output;
   end;
 proc print;
 run;

 And produces this:

 Obs     i    kanevar
   1     1    A
   2     2    B
   3     3    C
   4     4    X
   5     5    Throw me out
   6     6    6
   7     7    7
   8     8    8
   9     9    9
  10    10    10

 But it is more robust than what is shown here.



   
   


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] R and SAS proc format

2007-03-07 Thread Carlos J. Gil Bellosta


Also, SAS formats are used as a (somewhat cumbersome) replacement for
dictionary data structures. Starting from SAS 9.1 (I believe), hash
tables can be used within data steps for the same purpose (albeit
still cumbersome).

In this regard, in R not only factors but also lists could be a replacement
for them. They can be used as a way to get key-value mappings.

These key-value mappings (I mean, this kind of data structure) are
very handy tools. I have used both factors and lists as an ad hoc
replacement for them. Hasn't anybody considered the possibility of
having these data structures implemented in R with a more Python-like
or Java-like look and feel?
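
For reference, a small sketch of the two base-R idioms that come closest to a dictionary: a named list and an environment. The keys and values are made up for illustration.

```r
# A named list as a simple key-value map
ages <- list(alice = 31, bob = 45)
ages[["carol"]] <- 27          # insert
ages[["bob"]]                  # look up
# [1] 45
"dave" %in% names(ages)        # membership test
# [1] FALSE

# An environment gives hashed lookup and reference semantics
h <- new.env(hash = TRUE)
assign("alice", 31, envir = h)
h[["bob"]] <- 45
get("alice", envir = h)
# [1] 31
exists("dave", envir = h, inherits = FALSE)
# [1] FALSE
sort(ls(h))
# [1] "alice" "bob"
```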

Regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com



[R] R and SAS proc format

2007-03-06 Thread lamack lamack
Dear all, Is there an R equivalent to SAS's proc format?

Best regards

J. Lamack



Re: [R] R and SAS proc format

2007-03-06 Thread bogdan romocea
See ?cut for continuous variables, and ?factor, ?levels for the others.
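
A short sketch of both routes, with made-up data, break points and labels:

```r
# Continuous variable: bin into labelled ranges with cut()
age <- c(3, 17, 18, 42, 65, 80)
age_grp <- cut(age, breaks = c(0, 18, 65, Inf),
               labels = c("child", "adult", "senior"), right = FALSE)
as.character(age_grp)
# [1] "child"  "child"  "adult"  "adult"  "senior" "senior"

# Discrete codes: attach labels with factor(), inspect or rename via levels()
status <- factor(c(1, 2, 2, 1), levels = c(1, 2),
                 labels = c("active", "inactive"))
levels(status)
# [1] "active"   "inactive"
levels(status)[2] <- "closed"
as.character(status)
# [1] "active" "closed" "closed" "active"
```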




Re: [R] R and SAS proc format

2007-03-06 Thread Ulrike Grömping

The down side to R's factor solution: 
The numerical values of factors are always 1 to number of levels. Thus, it
can be tough and requires great care to work with studies that have both
numerical values different from this and value labels. This situation is
currently not well-supported by R.
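
To illustrate, a minimal base-R sketch with made-up codes (0, 1, 9) and labels; none of these names come from a real study:

```r
# Hypothetical study variable: original codes 0, 1 and 9 with value labels
codes  <- c(0, 1, 9, 1, 0)
gender <- factor(codes, levels = c(0, 1, 9),
                 labels = c("male", "female", "unknown"))

# The factor stores positions 1..3, not the original codes
as.integer(gender)
# [1] 1 2 3 2 1

# The original codes can only be recovered through a separately kept lookup
orig_levels <- c(0, 1, 9)
orig_levels[as.integer(gender)]
# [1] 0 1 9 1 0
```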

Regards, Ulrike

P.S.: I fully agree with Frank regarding the annoyance one sometimes
encounters with formats in SAS! 


lamack lamack wrote:
 
 Dear all, Is there an R equivalent to SAS's proc format?
 
 Best regards
 
 J. Lamack
 


Re: [R] R and SAS proc format

2007-03-06 Thread Frank E Harrell Jr
lamack lamack wrote:
 Dear all, Is there an R equivalent to SAS's proc format?
 
 Best regards
 
 J. Lamack

Fortunately not.  SAS is one of the few large systems that does not 
implicitly support value labels and that separates label information 
from the database [I can't count the number of times someone has sent me 
a SAS dataset and forgotten to send the PROC FORMAT value labels].  See 
the factor function for information about how R does this.

Frank Harrell

-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University



Re: [R] R and SAS proc format

2007-03-06 Thread John Kane

lamack lamack wrote:
 Dear all, Is there an R equivalent to SAS's proc format?

What does the SAS PROC FORMAT do?



Re: [R] R and SAS proc format

2007-03-06 Thread Frank E Harrell Jr
Ulrike Grömping wrote:
 The down side to R's factor solution: 
 The numerical values of factors are always 1 to number of levels. Thus, it
 can be tough and requires great care to work with studies that have both
 numerical values different from this and value labels. This situation is
 currently not well-supported by R.

You can add an attribute to a variable.  In the sas.get function in the 
Hmisc package for example, when importing SAS variables that have PROC 
FORMAT value labels, an attribute 'sas.codes' keeps the original codes; 
these can be retrieved using sas.codes(variable name).  This could be 
done outside the SAS import context also.
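
Outside the SAS import context, the same idea might look like the following base-R sketch. The attribute name "orig.codes" and the data are made up for illustration; this does not use or reproduce the Hmisc implementation.

```r
# Hypothetical example data: raw codes as they might arrive from another system
raw <- c(10, 20, 20, 30)
grp <- factor(raw, levels = c(10, 20, 30),
              labels = c("low", "medium", "high"))

# Keep the original codes alongside the labels; "orig.codes" is an
# illustrative attribute name, not something R or Hmisc defines
attr(grp, "orig.codes") <- c(10, 20, 30)

# Map each observation back to its original numeric code
attr(grp, "orig.codes")[as.integer(grp)]
# [1] 10 20 20 30
```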

Frank

 


 


-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University



Re: [R] R and SAS proc format

2007-03-06 Thread Jason Barnhart

On Tuesday, March 06, 2007 2:13 PM, John Kane wrote:
 lamack lamack wrote:
  Dear all, Is there an R equivalent to SAS's proc format?

 What does the SAS PROC FORMAT do?

It formats or reformats data in the SAS system.

It looks like this:

proc format;
  value kanefmt 1='A' 2='B' 3='C' 4='X' 5='Throw me out';
data temp;
  do i=1 to 10;
    kanevar=put(i,kanefmt.);
    output;
  end;
proc print;
run;

And produces this:

Obs     i    kanevar
  1     1    A
  2     2    B
  3     3    C
  4     4    X
  5     5    Throw me out
  6     6    6
  7     7    7
  8     8    8
  9     9    9
 10    10    10


But it is more robust than what is shown here.
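
For comparison, a rough base-R sketch that reproduces the listing above. It assumes a named character vector is an acceptable stand-in for the format; the fallback for unlabelled codes is written by hand, since R has no built-in counterpart to put().

```r
# Lookup playing the role of kanefmt: names are codes, values are labels
kanefmt <- c("1" = "A", "2" = "B", "3" = "C", "4" = "X", "5" = "Throw me out")

i <- 1:10
kanevar <- unname(kanefmt[as.character(i)])

# Codes without a label fall back to the code itself, as in the SAS output
unlabelled <- is.na(kanevar)
kanevar[unlabelled] <- as.character(i[unlabelled])

temp <- data.frame(i = i, kanevar = kanevar, stringsAsFactors = FALSE)
print(temp)
```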



