Re: [R] Deleting columns with 0's and also writing multiple csv files

2010-02-18 Thread Peter Dalgaard

(Forgot to cc the reply to K. Elo; apologies if you get it twice.)

K. Elo wrote:

Dear Anna,

19.02.2010 08:17, Anna Carter wrote:

(1) If the dataset contains some variables having all the entries = 0,
and while analysing I want to delete those particular columns, how do I
achieve this? i.e.


Let's suppose 'df' is your data frame, then:

subset(df, select=which(colSums(df)!=0))

should do the trick :)



Beware negative entries in df! which(colSums(df != 0) > 0) may work better,
but it is a bit "sneaky".

I'd also avoid subset() in favour of df[] or df[,]. And why use
indexing with which() when you can use the logical index directly?

My preference goes to df[, apply(df, 2, any)]. (A student assistant once
almost killed me when I showed her that, after she had spent days
programming the same thing using loops and whatnot...)
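
A minimal sketch contrasting the three suggestions on hypothetical data, where
var2 sums to zero without being all zeros:

```r
# Hypothetical data: var2 has paired +1/-1 entries, var3 is all zeros
df <- data.frame(var1 = c(5, 3), var2 = c(1, -1), var3 = c(0, 0))

df[, colSums(df) != 0, drop = FALSE]      # wrongly drops var2 too (it sums to zero)
df[, colSums(df != 0) > 0, drop = FALSE]  # keeps var2, drops var3
df[, apply(df, 2, any), drop = FALSE]     # same result: non-zero counts as TRUE
```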

--
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Deleting columns with 0's and also writing multiple csv files

2010-02-18 Thread RICHARD M. HEIBERGER
> (2) Suppose I have a variable number of datasets, say n = 10. I wish to write a
> loop assigning each of these datasets to different csv files, e.g.
>
> for (i in 1:10)
>  {
> write.csv(data.frame(dataset[,,i]), 'data_set[i].csv', row.names = FALSE)
>  }
>
> The result of this command is the generation of a single csv file,
> 'data_set[i].csv', containing only the last dataset (owing to the wrong
> command written by me).
>
> What I need is the creation of data_set[1].csv, data_set[2].csv, ...,
> data_set[10].csv, i.e. 10 different csv files containing 10 different
> datasets.
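
(Aside: the filename problem in the quoted loop is that 'data_set[i].csv' is a
literal string, so every iteration overwrites the same file. A minimal sketch of
the usual fix, assuming 'dataset' is a three-dimensional array as in the
question, builds each name with sprintf():)

```r
# Stand-in array, as in the question: 4 rows x 3 columns x 10 datasets
dataset <- array(rnorm(4 * 3 * 10), dim = c(4, 3, 10))

for (i in 1:10) {
  # sprintf() builds a distinct name each pass: data_set1.csv ... data_set10.csv
  write.csv(data.frame(dataset[, , i]),
            sprintf("data_set%d.csv", i), row.names = FALSE)
}
```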


Why do you want to create csv datasets?  They lose much of the structure
that is in the R object.  If you are trying to transfer them somewhere
else, a direct transfer would be a better choice.

If the goal is Excel, there are about five options.  I prefer RExcel
because it allows the tightest coordination of the R and Excel
calculations.

If the goal is one of the other popular statistical systems, they also
have direct connections, often through the foreign package.

Rich



Re: [R] problem with RGtk

2010-02-18 Thread David Winsemius


On Feb 18, 2010, at 11:32 PM, Wincent wrote:


Have you tried installing GTK2 manually first?
It can be downloaded from http://r.research.att.com/gtk2-runtime.dmg



That is for Tiger builds.


HTH

Ronggui

On 19 February 2010 10:02, Anna Oganyan wrote:



Dear List,
I would like to ask about the package RGtk2, with which I have a problem.
I would very much appreciate it if somebody could tell me what I need
to do.
I need to install the package scdMicro and it depends on gWidgetsRGtk2.

I am working on a Mac, version 10.5.8.
When I try to load gWidgetsRGtk2 (or RGtk2), it asks me:
"Instal GTK+?"

I installed GTK+ from CRAN, and added a path:

export PATH=$PATH:/Library/Frameworks/GTK+.framework/Resources/bin

But the next time I try to load gWidgetsRGtk2 (or RGtk2), I get the same
message: "Instal GTK+?"

Below I included the error message I got.

[error log omitted; identical to the log in Anna Oganyan's original post]


Thank you very much in advance. I will appreciate very much any
suggestions.

Anna Ogan






--
Wincent Ronggui HUANG
Doctoral Candidate
Dept of Public and Social Administration
City University of Hong Kong
http://asrr.r-forge.r-project.org/rghuang.html




David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] Deleting columns with 0's and also writing multiple csv files

2010-02-18 Thread David Winsemius


On Feb 19, 2010, at 1:36 AM, K. Elo wrote:


Dear Anna,

19.02.2010 08:17, Anna Carter wrote:

(1) If the dataset contains some variables having all the entries = 0,
and while analysing I want to delete those particular columns, how do I
achieve this? i.e.


Let's suppose 'df' is your data frame, then:

subset(df, select=which(colSums(df)!=0))

should do the trick :)


It would not work if there were paired negative and positive values, or
any collection that summed to zero.


> dataset1 <- structure(list(sr_no = 1:4, var1 = c(5L, 3L, 4L, 11L),
+   var2 = c(0, 0, 1, -1), var3 = c(3L, 2L, 4L, 1L),
+   var4 = c(1L, 9L, 7L, 6L), var5 = c(0L, 0L, 0L, 0L)),
+   .Names = c("sr_no", "var1", "var2", "var3", "var4", "var5"),
+   row.names = c(NA, -4L), class = "data.frame")

Perhaps:
> idx <- vector()
> for (x in seq_along(dataset1)) if (!all(dataset1[, x] == 0)) idx <- c(idx, x)

> dataset1[, idx]
  sr_no var1 var2 var3 var4
1     1    5    0    3    1
2     2    3    0    2    9
3     3    4    1    4    7
4     4   11   -1    1    6

Or a modification of Kimmo Elo's code, which would still "break" if any
columns were character:


> subset(dataset1, select = which(colSums(abs(dataset1)) != 0))
  sr_no var1 var2 var3 var4
1     1    5    0    3    1
2     2    3    0    2    9
3     3    4    1    4    7
4     4   11   -1    1    6






--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT



Re: [R] Deleting columns with 0's and also writing multiple csv files

2010-02-18 Thread K. Elo
Dear Anna,

19.02.2010 08:17, Anna Carter wrote:
> (1) If the dataset contains some variables having all the entries = 0,
> and while analysing I want to delete those particular columns, how do I
> achieve this? i.e.

Let's suppose 'df' is your data frame, then:

subset(df, select=which(colSums(df)!=0))

should do the trick :)

HTH,
Kimmo



Re: [R] color graph in multiple plots

2010-02-18 Thread David Winsemius


On Feb 19, 2010, at 12:38 AM, Roslina Zakaria wrote:


Hi,

I would like to distinguish my plots using colors, but I got an error
message.  How do I correct that?



plot(ecdf(z), main = "CDF for observed and simulated weighted sum",
     type = "l", lwd = 2, col = "blue",
     xlab = "Weighted sum (mm)", ylab = "Cumulative Percent",
     xlim = c(0, 15), xaxs = 'i', yaxs = 'i', ylim = c(0, 1))

par(new = TRUE)
plot(gen[, 1], gen[, 2], type = "l", lwd = 2, ann = FALSE, axes = FALSE,
     col = "red", xlim = c(0, 15), ylim = c(0, 1), xaxs = 'i', yaxs = 'i')

legend(10, .2, c("obs", "pre"), col = c("blue", "red"), lwd = c(1, 2))

> plot(ecdf(z), main = "CDF for observed and simulated weighted sum",
+      type = "l", lwd = 2,
+      xlab = "Weighted sum (mm)", ylab = "Cumulative Percent",
+      xlim = c(0, 15), xaxs = 'i', yaxs = 'i', ylim = c(0, 1))
Error in plot.default(0, 0, type = "n", xlim = xlim, ylim = ylim, xlab = xlab,  :
  formal argument "type" matched by multiple actual arguments


Drop the type argument from the plot call. plot.ecdf hands off to
plot.stepfun, which is hard-coded for type = "n" and then plots the
steps with the segments function. Not much more can be said, since you
provided neither "z" nor "gen".
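
A minimal sketch of that fix, with made-up stand-ins for the missing "z" and
"gen": drop type= from the plot.ecdf() call, and overlay the second curve with
lines() instead of par(new = TRUE):

```r
z   <- runif(100, 0, 15)                   # stand-in for the poster's 'z'
gen <- cbind(sort(z), seq_along(z) / 100)  # stand-in for the poster's 'gen'

plot(ecdf(z), main = "CDF for observed and simulated weighted sum",
     lwd = 2, col = "blue",                # no type=; plot.stepfun sets its own
     xlab = "Weighted sum (mm)", ylab = "Cumulative Percent",
     xlim = c(0, 15), ylim = c(0, 1), xaxs = "i", yaxs = "i")
lines(gen[, 1], gen[, 2], lwd = 2, col = "red")  # overlay; no par(new = TRUE)
legend(10, 0.2, c("obs", "pre"), col = c("blue", "red"), lwd = 2)
```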


--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT



[R] Deleting columns with 0's and also writing multiple csv files

2010-02-18 Thread Anna Carter
Dear R helpers,
 
I have two queries.
 
(1) If the dataset contains some variables having all the entries = 0 and while 
analysing I want to delete those particular columns, how do I achieve this? i.e.
 
dataset1
 
sr_no  var1  var2  var3  var4  var5
    1     5     0     3     1     0
    2     3     0     2     9     0
    3     4     0     4     7     0
    4    11     0     1     6     0
 
In the above dataset, var2 and var5 are all 0's, so I don't want to select these 
columns. It is not that these two variables will always be zeros, so in general 
how can the dataset be filtered to keep only the non-zero columns?
 
(2) Suppose I have a variable number of datasets, say n = 10. I wish to write a 
loop assigning each of these datasets to different csv files, e.g.
 
for (i in 1:10)
 {
write.csv(data.frame(dataset[,,i]), 'data_set[i].csv', row.names = FALSE)
 }

The result of this command is the generation of a single csv file, 
'data_set[i].csv', containing only the last dataset (owing to the wrong 
command written by me).
 
What I need is the creation of data_set[1].csv, data_set[2].csv, ..., 
data_set[10].csv, i.e. 10 different csv files containing 10 different 
datasets.
Thanking you in advance

Anna
 





Re: [R] problem with RGtk

2010-02-18 Thread David Winsemius
I take back what I said about not being able to use RGtk2 with the 64-bit
Mac. I (just today, for the first time) can now get RGtk2 to load and then
get rgl to run properly. I had been able to run rgl in the 32-bit system,
but that was not satisfactory. I had earlier installed the 2.18 version of
GTK+ from the r.research.att.com site. It's possible that you got there
from the CRAN website, since they do have a link that displays inside a
CRAN frame.


So it looks like you need to get this file:

http://r.research.att.com/libs/GTK_2.18.5-X11.pkg

It's possible that I failed to restart my system and that was needed.
I don't know.

Then install RGtk2 and gWidgetsRGtk2 and any dependencies, and load RGtk2.


Best,
David.


On Feb 18, 2010, at 9:02 PM, Anna Oganyan wrote:


Dear List,
I would like to ask about the package RGtk2, with which I have a problem.
I would very much appreciate it if somebody could tell me what I need
to do.

I need to install the package scdMicro and it depends on gWidgetsRGtk2.
I am working on a Mac, version 10.5.8.
When I try to load gWidgetsRGtk2 (or RGtk2), it asks me:
"Instal GTK+?"

I installed GTK+ from CRAN, and added a path:

export PATH=$PATH:/Library/Frameworks/GTK+.framework/Resources/bin

But the next time I try to load gWidgetsRGtk2 (or RGtk2), I get the same
message: "Instal GTK+?"


Below I included the error message I got.
[error log omitted; identical to the log in Anna Oganyan's original post]


Thank you very much in advance. I would appreciate any suggestions.


Anna Ogan


[R] color graph in multiple plots

2010-02-18 Thread Roslina Zakaria
Hi,
 
I would like to distinguish my plots using colors, but I got an error message.  
How do I correct that?
 
 
plot(ecdf(z), main = "CDF for observed and simulated weighted sum",
     type = "l", lwd = 2, col = "blue",
     xlab = "Weighted sum (mm)", ylab = "Cumulative Percent",
     xlim = c(0, 15), xaxs = 'i', yaxs = 'i', ylim = c(0, 1))
par(new = TRUE)
plot(gen[, 1], gen[, 2], type = "l", lwd = 2, ann = FALSE, axes = FALSE,
     col = "red", xlim = c(0, 15), ylim = c(0, 1), xaxs = 'i', yaxs = 'i')
legend(10, .2, c("obs", "pre"), col = c("blue", "red"), lwd = c(1, 2))

> plot(ecdf(z), main = "CDF for observed and simulated weighted sum",
+      type = "l", lwd = 2,
+      xlab = "Weighted sum (mm)", ylab = "Cumulative Percent",
+      xlim = c(0, 15), xaxs = 'i', yaxs = 'i', ylim = c(0, 1))
Error in plot.default(0, 0, type = "n", xlim = xlim, ylim = ylim, xlab = xlab,  : 
  formal argument "type" matched by multiple actual arguments
 
Thank you.



  



Re: [R] problem with RGtk

2010-02-18 Thread Wincent
Have you tried installing GTK2 manually first?
It can be downloaded from http://r.research.att.com/gtk2-runtime.dmg

HTH

Ronggui

On 19 February 2010 10:02, Anna Oganyan wrote:

> Dear List,
> I would like to ask about the package RGtk2, with which I have a problem.
> I would very much appreciate it if somebody could tell me what I need to do.
> I need to install the package scdMicro and it depends on gWidgetsRGtk2.
> I am working on a Mac, version 10.5.8.
> When I try to load gWidgetsRGtk2 (or RGtk2), it asks me:
> "Instal GTK+?"
>
> I installed GTK+ from CRAN, and added a path:
>
> export PATH=$PATH:/Library/Frameworks/GTK+.framework/Resources/bin
>
> But the next time I try to load gWidgetsRGtk2 (or RGtk2), I get the same
> message: "Instal GTK+?"
>
> Below I included the error message I got.
>
> [error log omitted; identical to the log in Anna Oganyan's original post]
>
>
> Thank you very much in advance. I will appreciate very much any
> suggestions.
>
> Anna Ogan
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Wincent Ronggui HUANG
Doctoral Candidate
Dept of Public and Social Administration
City University of Hong Kong
http://asrr.r-forge.r-project.org/rghuang.html




[R] problem with RGtk

2010-02-18 Thread Anna Oganyan
Dear List,
I would like to ask about the package RGtk2, with which I have a problem.
I would very much appreciate it if somebody could tell me what I need to do.
I need to install the package scdMicro and it depends on gWidgetsRGtk2.
I am working on a Mac, version 10.5.8.
When I try to load gWidgetsRGtk2 (or RGtk2), it asks me:
"Instal GTK+?"

I installed GTK+ from CRAN, and added a path:

export PATH=$PATH:/Library/Frameworks/GTK+.framework/Resources/bin

But the next time I try to load gWidgetsRGtk2 (or RGtk2), I get the same 
message: "Instal GTK+?"

Below I included the error message I got.
__
Loading required package: gWidgetsRGtk2
Loading required package: gWidgets
Error in dyn.load(file, DLLpath = DLLpath, ...) : 
  unable to load shared library 
'/Users/aoganyan/Library/R/2.10/library/RGtk2/libs/i386/RGtk2.so':
  dlopen(/Users/aoganyan/Library/R/2.10/library/RGtk2/libs/i386/RGtk2.so, 6): 
Library not loaded: 
/Library/Frameworks/GTK+.framework/Resources/lib/libgtk-x11-2.0.0.dylib
  Referenced from: 
/Users/aoganyan/Library/R/2.10/library/RGtk2/libs/i386/RGtk2.so
  Reason: Incompatible library version: RGtk2.so requires version 1801.0.0 or 
later, but libgtk-x11-2.0.0.dylib provides version 1401.0.0
In addition: Warning message:
In fun(...) : couldn't connect to display "/tmp/launch-WrMGBi/:0"
Failed to load RGtk2 dynamic library, attempting to install it.
Error in if (choice == 1) { : argument is of length zero
Error : .onLoad failed in 'loadNamespace' for 'RGtk2'
Error: package 'gWidgetsRGtk2' could not be loaded
starting httpd help server ... done
Error in dyn.load(file, DLLpath = DLLpath, ...) : 
  unable to load shared library 
'/Users/aoganyan/Library/R/2.10/library/RGtk2/libs/i386/RGtk2.so':
  dlopen(/Users/aoganyan/Library/R/2.10/library/RGtk2/libs/i386/RGtk2.so, 6): 
Library not loaded: 
/Library/Frameworks/GTK+.framework/Resources/lib/libgtk-x11-2.0.0.dylib
  Referenced from: 
/Users/aoganyan/Library/R/2.10/library/RGtk2/libs/i386/RGtk2.so
  Reason: Incompatible library version: RGtk2.so requires version 1801.0.0 or 
later, but libgtk-x11-2.0.0.dylib provides version 1401.0.0
Failed to load RGtk2 dynamic library, attempting to install it.
Error in if (choice == 1) { : argument is of length zero
Error : .onLoad failed in 'loadNamespace' for 'RGtk2'
Error: package/namespace load failed for 'gWidgetsRGtk2'
Error in structure(.External("dotTcl", ..., PACKAGE = "tcltk"), class = 
"tclObj") : 
  [tcl] invalid command name "option".

Error : .onLoad failed in 'loadNamespace' for 'gWidgetstcltk'
Error: package/namespace load failed for 'gWidgetstcltk'
Loading required package: proto
Loading required package: filehash
Error: package 'filehash' could not be loaded
In addition: Warning message:
In library(pkg, character.only = TRUE, logical.return = TRUE, lib.loc = 
lib.loc) :
  there is no package called 'filehash'
Loading required package: gWidgetsRGtk2
Error in dyn.load(file, DLLpath = DLLpath, ...) : 
  unable to load shared library 
'/Users/aoganyan/Library/R/2.10/library/RGtk2/libs/i386/RGtk2.so':
  dlopen(/Users/aoganyan/Library/R/2.10/library/RGtk2/libs/i386/RGtk2.so, 6): 
Library not loaded: 
/Library/Frameworks/GTK+.framework/Resources/lib/libgtk-x11-2.0.0.dylib
  Referenced from: 
/Users/aoganyan/Library/R/2.10/library/RGtk2/libs/i386/RGtk2.so
  Reason: Incompatible library version: RGtk2.so requires version 1801.0.0 or 
later, but libgtk-x11-2.0.0.dylib provides version 1401.0.0
Failed to load RGtk2 dynamic library, attempting to install it.
Error in if (choice == 1) { : argument is of length zero
Error : .onLoad failed in 'loadNamespace' for 'RGtk2'
Error: package 'gWidgetsRGtk2' could not be loaded

___


Thank you very much in advance. I would appreciate any suggestions.

Anna Ogan



[R] Set Colour of Histogram Bars (lattice)

2010-02-18 Thread DarioAustralia

Hello there,

I have a bunch of histogram bars, and I'd like the first to be a certain
colour, the second another colour, the third a third colour, and so on,
repeating across all 39 bars.

I thought this was the way to go, but I get the same cyan-coloured bars
everywhere. I used a vector of 3 colours, hoping it would be recycled over
the other bars.


> levels(forPlot$type)
[1] "Blah" "Blah 2" "Blah 3" 

trellis.par.set("plot.polygon$col", c("red", "green", "blue"))
histogram( ~ cpgBin:type, data = forPlot, scales = list(x = list(rot = 90)),
main = "CpG Density vs. Number Of Our Regions", xlab = "CpG Bin", ylab =
"Percent of Our Regions")

Can anyone help me here?

Thanks,
   Dario.
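
For what it's worth, trellis.par.set() takes a named list rather than a
"$"-path string, so the call above appears to have no effect. A minimal sketch
of the documented syntax follows (note: whether a colour vector is recycled
across bars depends on the panel function, so per-bar colours may still need a
custom panel):

```r
library(lattice)

# Documented form: a named list, not a "$"-path string
trellis.par.set(plot.polygon = list(col = "red"))

# Alternatively, pass col straight to the high-level call
p <- histogram(~ rnorm(100), col = "red")
print(p)   # lattice objects only draw when printed
```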





Re: [R] Use of R in clinical trials

2010-02-18 Thread myrmail
I am old enough to have lived through this particular transition.
Prior to the advent of SAS, trials were analyzed by in-house written
programs (usually in Fortran, maybe with the help of IMSL). These
programs were huge card decks. Having the card reader eat a card
halfway through reading the deck was not an unusual occurrence.

I was responsible for deploying the first version of SAS. This meant
compiling PL/I code stored on a magnetic tape and storing it on limited
and expensive disk drives. It was several years before the transition
from using in-house programs to SAS was completed. Yes, there was a
great deal of angst, and I spent a lot of time convincing people that
there would be a cost advantage in the end, and overcoming institutional
inertia.

By the way, this was all done on computers that you will probably find
only in a museum, if at all. These systems filled whole rooms and required
a staff just to keep them running.

Murray M Cooper, PhD
Richland Statistics
9800 North 24th St
Richland, MI  49083



-Original Message-
>From: "Christopher W. Ryan" 
>Sent: Feb 18, 2010 1:08 PM
>To: r-help@r-project.org
>Cc: p.dalga...@biostat.ku.dk
>Subject: Re: [R] Use of R in clinical trials
>
>Pure Food and Drug Act: 1906
>FDA: 1930s
>founding of SAS: early 1970s
>
>(from the history websites of SAS and FDA)
>
>What did pharmaceutical companies use for data analysis before there was 
>SAS? And was there much angst over the change to SAS from whatever was 
>in use before?
>
>Or was there not such emphasis on and need for thorough data analysis 
>back then?
>
>--Chris
>Christopher W. Ryan, MD
>SUNY Upstate Medical University Clinical Campus at Binghamton
>425 Robinson Street, Binghamton, NY  13904
>cryanatbinghamtondotedu
>
>"If you want to build a ship, don't drum up the men to gather wood, 
>divide the work and give orders. Instead, teach them to yearn for the 
>vast and endless sea."  [Antoine de St. Exupery]
>
>Bert Gunter wrote:
>> DISCLAIMER: This represents my personal view and in no way reflects that of
>> my company.
>> 
>> Warning: This is a long harangue that contains no useful information on R.
>> May be wise to delete without reading. 
>> --
>> 
>> Sorry folks, I still don't understand your comments. As Cody's original post
>> pointed out, there are a host of factors other than ease of programmability
>> or even quality of results that weigh against any change. To reiterate, all
>> companies have a huge infrastructure of **validated SAS code** that would
>> have to be replaced. This, in itself, would take years and cost tens of
>> millions of dollars at least. Also to reiterate, it's not only
>> statistical/reporting functionality but even more the integration into the
>> existing clinical database systems that would have to be rewritten **and
>> validated**. All this would have to be done while continuing full steam on
>> existing submissions. It is therefore not surprising to me that no pharma
>> company in its right mind even contemplates undertaking such an effort.
>> 
>> To put these things into perspective. Let's say Pfizer has 200 SAS
>> programmers (it's probably more, as they are a large Pharma, but I dunno).
>> If each programmer costs, conservatively, $200K U.S. per year fully loaded,
>> that's $40 million U.S. for SAS Programmers. And this is probably a severe
>> underestimate. So the $14M quoted below is chicken feed -- it doesn't even
>> make the radar. 
>> 
>> To add further perspective, a single (large) pivotal clinical trial can
>> easily cost $250M. A delay in approval due to fooling around trying to
>> shift to a whole new software system could easily cost hundreds of millions
>> to billions if it means a competitor gets to the marketplace first. So, to
>> repeat, SAS costs are chicken feed.
>> 
>> Yes, I suppose that the present system institutionalizes mediocrity. How
>> could it be otherwise in any such large scale enterprise? Continuity,
>> reliability, and robustness are all orders of magnitude more important for
>> both the FDA and Pharma to get safe and efficacious drugs to the public.
>> Constantly hopping onto the latest and greatest "craze" (yes, I exaggerate
>> here!) would be dangerous, unacceptable, and would probably delay drug
>> approvals. I consider this another example of the Kuhnsian paradigm (Thomas
>> Kuhn: "The Structure of Scientific Revolutions")in action.
>> 
>> This is **not** to say that there is not a useful role for R (or STATA or
>> ...) to play in clinical trial submissions or, more generally, in drug
>> research and development. There certainly is. For the record, I use R
>> exclusively in my (nonclinical statistics) work. Nor is to say that all
>> change must be avoided. That would be downright dangerous. But let's please
>> keep these issues in perspective. One's enthusiasm for R's manifold virtues
>> should not replace common sense and logic. That, too, would be unfortunate.
>> 
>> Since I've freely blustered, I am now a fair target

Re: [R] sql query variable

2010-02-18 Thread RagingJim

Thanks guys. I ended up doing as you suggested Dieter. Thanks for the idea :)
-- 
View this message in context: 
http://n4.nabble.com/sql-query-variable-tp1558189p1561158.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lint for R? and debugging

2010-02-18 Thread luke

On Tue, 16 Feb 2010, Esmail wrote:


On 16-Feb-10 09:03, Karl Ove Hufthammer wrote:

On Tue, 16 Feb 2010 08:00:09 -0500 Esmail  wrote:

And along the same lines, any type of interactive debugging
utility for R?


See this article in R News:

'Debugging Without (Too Many) Tears'
http://cran.r-project.org/doc/Rnews/Rnews_2003-3.pdf#page=29



Thanks for the pointer, that looks very interesting.

Any lint-like utilities out there? I miss a lot of the development
tools I have available for Python or Java with R, esp once the code
starts to grow beyond a few hundred lines.



The codetools package may provide some of what you want.

luke
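
The codetools suggestion can be sketched as follows: `checkUsage()` flags
unused locals and undefined globals, somewhat like a lint pass (the function
`f` below is a made-up example, not from the thread):

```r
# Minimal lint-style check with the codetools package
library(codetools)
f <- function(x) {
  y <- 2        # assigned but never used
  x + zz        # 'zz' has no visible binding
}
checkUsage(f)   # reports both problems above
```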


Esmail

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa  Phone: 319-335-3386
Department of Statistics and        Fax:   319-335-3017
   Actuarial Science
241 Schaeffer Hall  email:  l...@stat.uiowa.edu
Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] running means

2010-02-18 Thread RagingJim

Ok, took me a while, but I figured it out. Because my running mean had fewer
years than my standard rainfall graph, when I overlaid the running mean onto
the rainfall it was trying to stretch out. So I just plotted both onto the
same graph, like so:

barplot(Ann,main=title, xlab="Year",ylab="Rainfall (mm)",
ylim=c(0,ymax),col="blue",space=0)
cp<-barplot(new[,2],space=0,col=NA,border=NA,names.arg=NA,add=T)
lines(cp,c(new[,2]),lwd=2)

#"new" is the graph I made using the running mean rainfall data tabulated
with the full year set.
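
The same idea as a self-contained sketch (fake data and hypothetical names):
`barplot()` returns the bar midpoints, which `lines()` can then use as
x-coordinates for the overlay.

```r
set.seed(1)
rain <- runif(30, 200, 1200)              # fake annual rainfall totals
run5 <- stats::filter(rain, rep(1/5, 5))  # centred 5-year running mean (NA at ends)
mids <- barplot(rain, col = "blue", space = 0,
                xlab = "Year", ylab = "Rainfall (mm)")
lines(mids, run5, lwd = 2)                # overlay at the bar midpoints
```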
-- 
View this message in context: 
http://n4.nabble.com/prompts-and-running-means-tp1475403p1561105.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Appending Data to A n Existing List...

2010-02-18 Thread Jason Rupert
Thanks for all the replies!  

I think I posted a poor example, but now I understand that you can name entries 
in a list.  That is really neat.  

Before I was just letting them default, which is kind of problematic. 

I think I also came across another way and also used the suggestion from Bert 
Gunter - thanks again: 

> example_list<-list(tracking<-c("house"), house_type<-c("brick", "wood"), 
> sizes<-c(1600, 1800, 2000, 2400))
> example_list
[[1]]
[1] "house"

[[2]]
[1] "brick" "wood" 

[[3]]
[1] 1600 1800 2000 2400

> 
> cost_limits<-c(20.25, 350010.15)
> 
> example_list[[4]]<-cost_limits
> 
> example_list
[[1]]
[1] "house"

[[2]]
[1] "brick" "wood" 

[[3]]
[1] 1600 1800 2000 2400

[[4]]
[1] 20.2 350010.2



> c(example_list,list(CostStuff=cost_limits))
[[1]]
[1] "house"

[[2]]
[1] "brick" "wood" 

[[3]]
[1] 1600 1800 2000 2400

[[4]]
[1] 20.2 350010.2

$CostStuff
[1] 20.2 350010.2





- Original Message 
From: Bert Gunter 
To: Jason Rupert ; R-help@r-project.org
Sent: Thu, February 18, 2010 5:37:09 PM
Subject: RE: [R] Appending Data to A n Existing List...

?c 

but you have to make sure z is a list:

c(pts,z) ## probably is not what you want

c(pts,list(z)) ## probably is, but z will be unnamed

c(pts,list(z=z)) ## names z "z"


Cheers,

Bert Gunter
Genentech Nonclinical Biostatistics



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Jason Rupert
Sent: Thursday, February 18, 2010 3:16 PM
To: R-help@r-project.org
Subject: [R] Appending Data to A n Existing List...

What steps can be taken to append data to a list? 

Suppose I have the following list and want to append a z axis to the list?

pts <- list(x=cars[,1], y=cars[,2])

z<-rnorm(max(dim(cars)))

How would I go about appending z to an existing list? 

Thanks a ton...

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Subtracting one based on an If

2010-02-18 Thread Ista Zahn
I think you can just use

CondoLots[CondoLots > 1] <- CondoLots[CondoLots > 1] -1

-Ista
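
Using the sample data from the original post, the whole thing fits in a few
lines (the loop and ifelse() are unnecessary):

```r
# Count lots per building, then knock one off wherever the count exceeds 1
Bldgid <- c(1000, 1000, 1000, 1001, 1002, 1003, 1003, 1003)
Maplot <- c(2, 20001, 20002, 3, 30001, 4, 40001, 40002)
CondoLots <- tapply(Maplot, Bldgid, length)   # counts: 3 1 1 3
CondoLots[CondoLots > 1] <- CondoLots[CondoLots > 1] - 1
CondoLots
# 1000 1001 1002 1003
#    2    1    1    2
```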

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Subtracting one based on an If

2010-02-18 Thread LCOG1

For the following:

 Bldgid<-c(1000,1000,1000,1001,1002,1003,1003,1003)
 Maplot<-c(2,20001,20002,3,30001,4,40001,40002)
 Area<-c(40,170,160,50,100,100,90,110)
 #Construct Sample dataframe
 MultiLotBldgs..<-data.frame(Bldgid,Maplot,Area)

CondoLots_ <- tapply(MultiLotBldgs..$Maplot, MultiLotBldgs..$Bldgid, length)


CondoLots_ Returns:

1000 1001 1002 1003 
   3    1    1    3 

What i want to do is to subtract 1 from the above for all cases where there
are more than one, so that  CondoLots_ returns:
1000 1001 1002 1003 
   2    1    1    2

I have tried 

for(i in 1:length(CondoLots_)){
ifelse(CondoLots_>1){ 
 CondoLots2_<-CondoLots_-1
}

 }
But it doesn't seem to work properly.  I think this should be simple.

Thanks in advance.
JR
-- 
View this message in context: 
http://n4.nabble.com/Subtracting-one-based-on-an-If-tp1561047p1561047.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problem with multiple plots (mfrow, mar)

2010-02-18 Thread Peter Neuhaus

Greg Snow wrote:


Use outer margins.  Try something like:


par(mfrow=c(3,1), mar=c(0,4,0,2)+0.1, oma=c(5,0,3,0)+0.1 )


Then do your plots without resetting margins.


Thanks. Perfect! This little detail has been bothering me for quite a
while...


Also you can use xaxt='n' rather than axes=FALSE to suppress just the x axis 
and not have to do the y axis and box by hand.


very helpful hint, too - will save me a lot of typing...

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Graphics Question

2010-02-18 Thread RICHARD M. HEIBERGER
## The grouped boxplot is one of the features included in the HH package.
## You will need to install HH if you do not yet have the HH package
## A similar example is posted on my website
##http://astro.ocis.temple.edu/~rmh/HH/bwplot-color.pdf

## This is fake data which I hope mimics the structure you hinted at.
## I am using n=10 levels of Factor and 6 Years
## with 5 observations at each Year:Factor value
X <- data.frame(Year=rep(factor(1999:2004), each=10),
Factor=factor(rep(1:10, 6)),
Freq=1:60)
X <- rbind(X,X,X,X,X)
X$Freq <- X$Freq + rnorm(5*60, s=3)

X$YearFactor <- with(X, interaction(Factor, Year))


## install.packages("HH")  ## if you do not yet have the HH package
library(HH)

b1 <- bwplot(Freq ~ Year, data=X,
 panel=panel.bwplot.intermediate.hh,
 col=1:6,
 scales=list(x=list(limits=c(0,7), at=1:6, labels=levels(X$Year))),
 main="Grouped")

b2 <- bwplot(Freq ~ YearFactor, data=X,
 panel=panel.bwplot.intermediate.hh,
 col=rep(1:6, each=10),
 scales=list(x=list(limits=c(0,70)-5, at=seq(10,60,10)-5,
labels=levels(X$Year))),
 main="Ungrouped")

print(b1, split=c(1,1,1,2), more=TRUE)
print(b2, split=c(1,2,1,2))

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] svm and RMSE

2010-02-18 Thread Amy Hessen

Hi
Thank you very much for your reply.
 
I meant I got different RMSE values from different runs of the same program, 
without any change to the dataset or the SVM parameters.
 
This is my code:
 
library(e1071)
readingmydata <- as.matrix(read.delim("mydataset.txt"))
train.x <- readingmydata[,-1]
train.y <- readingmydata[,1]
 
mymodel <- svm(train.x, train.y, cross=10)
summary(mymodel)
 
can you please tell me how I can fix that error?
 
Cheers,
Amy

 
> Date: Tue, 16 Feb 2010 10:33:19 -0500
> Subject: Re: [R] svm and RMSE
> From: mailinglist.honey...@gmail.com
> To: amy_4_5...@hotmail.com
> CC: r-help@r-project.org
> 
> Hi,
> 
> On Fri, Feb 12, 2010 at 3:00 PM, Amy Hessen  wrote:
> >
> > Hi,
> > Every time I run a svm regression program, I got different RMSE value.
> > Could you please tell me what the reason for that?
> 
> Sorry, your question is a bit vague.
> 
> Can you provide an example/code that shows this behavior? Is the
> different RMSE over different folds of cross validation. Over the same
> data? With the same parameters? Is the RMSE significantly different?
> 
> Providing an example that shows this behavior would help.
> 
> Thanks,
> -steve
> 
> -- 
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
> | Memorial Sloan-Kettering Cancer Center
> | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact
  

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Appending Data to A n Existing List...

2010-02-18 Thread Bert Gunter
?c 

but you have to make sure z is a list:

c(pts,z) ## probably is not what you want

c(pts,list(z)) ## probably is, but z will be unnamed

c(pts,list(z=z)) ## names z "z"


Cheers,

Bert Gunter
Genentech Nonclinical Biostatistics
 
 

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Jason Rupert
Sent: Thursday, February 18, 2010 3:16 PM
To: R-help@r-project.org
Subject: [R] Appending Data to A n Existing List...

What steps can be taken to append data to a list? 

Suppose I have the following list and want to append a z axis to the list?

pts <- list(x=cars[,1], y=cars[,2])

z<-rnorm(max(dim(cars)))

How would I go about appending z to an existing list? 

Thanks a ton...

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] survey package question

2010-02-18 Thread Thomas Lumley

On Thu, 18 Feb 2010, Richard Valliant wrote:


Should the svyby function be able to work with svyquantile?  I get the
error below ...


It works, but you need to either specify ci=TRUE or keep.var=FALSE.  The 
problem is that svyquantile() by default does not produce standard errors.


svyby(~api00, ~stype, design=dclus1,svyquantile, 
quantile=c(0.25,0.5,.75),ci=TRUE)

  stype   0.25   0.5  0.75  se.0.25   se.0.5  se.0.75
E E 553.00 652.0 729.0 27.74745 37.81025 17.12898
H H 523.00 608.0 699.5 46.01881 65.82488 33.38789
M M 532.75 636.5 696.5 60.78990 43.12310 55.27555

svyby(~api00, ~stype, design=dclus1,svyquantile, quantile=c(0.25,0.5,.75), 
keep.var=FALSE)

  stype statistic1 statistic2 statistic3
E E 553.00  652.0  729.0
H H 523.00  608.0  699.5
M M 532.75  636.5  696.5



A more general question is: can quantiles and their SEs be computed for
subgroups?



You can also use subset(), which is what svyby() does internally


svyquantile(~api00, quantile=c(0.25,0.5,0.75), subset(dclus1, stype=="E"))

  0.25 0.5 0.75
api00  553 652  729


   -thomas


Thomas Lumley   Assoc. Professor, Biostatistics
tlum...@u.washington.eduUniversity of Washington, Seattle

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Appending Data to A n Existing List...

2010-02-18 Thread David Winsemius


On Feb 18, 2010, at 6:15 PM, Jason Rupert wrote:


What steps can be taken to append data to a list?

Suppose I have the following list and want to append a z axis to  
the list?


pts <- list(x=cars[,1], y=cars[,2])

z<-rnorm(max(dim(cars)))

How would I go about appending z to an existing list?


"Appending" is a bit vague, but why not:

pts$z <- z   # what I think you meant

?  or

pts[[3]] <- z   # not exactly the same and probably not what you wanted


--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] 2 ecdf from different set of data

2010-02-18 Thread Roslina Zakaria

Hi r-users,
 
I have 2 sets of data and I would like to superimpose their cumulative 
distribution curves in one graph.
 
I know how to put the 2 graphs in one same graph but my problem is the data are 
different.
 
> z[1:20]
 [1]  2.02347388  3.19514379  0.05188875  1.41333812  3.50249892  4.34272676  
6.65639581  5.01533819  4.18207505
[10]  2.87615551  2.28202152  0.49431541  0.06570579  5.68352438 11.37222124  
7.29185093  4.32209430  2.65243100
[19]  4.60515744  2.25838285
 
   zgen   dgen
 [1,] 10.04 0.9627
 [2,] 10.05 0.9629
 [3,] 10.06 0.9631
 [4,] 10.07 0.9632
 [5,] 10.08 0.9634
 [6,] 10.09 0.9635
 [7,] 10.10 0.9637
 [8,] 10.11 0.9638
 [9,] 10.12 0.9640
[10,] 10.13 0.9642
[11,] 10.14 0.9643
 
For the z data, I can simply plot(ecdf(z)).  But the gen data is a two-column 
matrix; with a scatter plot I can get the cdf: plot(zgen,dgen).
 
I tried:
 
plot(ecdf(z), verticals= TRUE,do.p = FALSE, main ="CDF for observed and 
simulated weighted sum",
xlab="Weighted sum (mm)", ylab="Cumulative Percent", xlim=c(0,30), xaxs ='i', 
yaxs ='i')
lines(c(zgen,dgen),verticals= TRUE,do.p = FALSE,col.h="red",col.v="red",lwd=2)
legend(10,.2,c("obs","pre"),col=c("black","red"),lwd=c(1,2))
 
I do not get what I want.
So how do I draw the 2 different ecdf on the same graph given 2 different set 
of data.
 
 
Thank you so much for your help.
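
One way to get the overlay (a sketch with made-up stand-ins for z, zgen and
dgen; the key point is that the precomputed CDF is just an x-y curve, so a
plain `lines(zgen, dgen)` draws it on top of the ecdf):

```r
set.seed(1)
z    <- rgamma(20, shape = 2, scale = 2)     # stands in for the observed data
zgen <- seq(0, 30, by = 0.01)                # stands in for the simulated grid
dgen <- pgamma(zgen, shape = 2, scale = 2)   # precomputed cumulative values
plot(ecdf(z), verticals = TRUE, do.points = FALSE,
     main = "CDF for observed and simulated weighted sum",
     xlab = "Weighted sum (mm)", ylab = "Cumulative proportion",
     xlim = c(0, 30))
lines(zgen, dgen, col = "red", lwd = 2)      # second curve as a plain x-y line
legend("bottomright", c("obs", "pre"), col = c("black", "red"), lwd = c(1, 2))
```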
 
 
 
 

 


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Appending Data to A n Existing List...

2010-02-18 Thread Jason Rupert
What steps can be taken to append data to a list? 

Suppose I have the following list and want to append a z axis to the list?

pts <- list(x=cars[,1], y=cars[,2])

z<-rnorm(max(dim(cars)))

How would I go about appending z to an existing list? 

Thanks a ton...

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Use of R in clinical trials

2010-02-18 Thread Marc Schwartz
On Feb 18, 2010, at 4:26 PM, Christopher W. Ryan wrote:

> Anyone have any recollection of Prophet software, from the National (US) 
> Center for Research Resources?
> 
> --Chris

A quick Google search comes up with a very dated site at Northwestern:

  http://www.basic.northwestern.edu/biotools/prophet.html

and

  http://www.basic.northwestern.edu/statguidefiles/sghome.html


The BBN site is now at:

  http://www.bbn.com/healthcare/

But the only references there to Prophet are historical.

HTH,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Use of R in clinical trials

2010-02-18 Thread Christopher W. Ryan
Anyone have any recollection of Prophet software, from the National (US) 
Center for Research Resources?


--Chris

Christopher W. Ryan, MD
SUNY Upstate Medical University Clinical Campus at Binghamton
425 Robinson Street, Binghamton, NY  13904
cryanatbinghamtondotedu

"If you want to build a ship, don't drum up the men to gather wood, 
divide the work and give orders. Instead, teach them to yearn for the 
vast and endless sea."  [Antoine de St. Exupery]


Marc Schwartz wrote:

On Feb 18, 2010, at 3:50 PM, Nordlund, Dan (DSHS/RDA) wrote:


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Peter Dalgaard
Sent: Thursday, February 18, 2010 12:44 PM
To: Douglas Bates
Cc: r-help@r-project.org; Bert Gunter
Subject: Re: [R] Use of R in clinical trials



(Corrections/additional information welcome!)

My recollection is that the BMD programs (which, in a later version,
became BMDP) predated SAS and were specifically for BioMeDical
analysis.

How could I forget those! Yes, my old (as in 1980-1985) boss at the
University hospital even had the manual in the office. It wasn't a
statistical system though, more a suite of single-purpose computer
programs with a rigid control-card specification format.

BTW, they were apparently put in the public domain by UCLA, but I wonder
where they went?


I believe BMDP was bought by SPSS around 1996.  SPSS also purchased Systat in 
that same time period I believe.

Dan



Statistical Solutions has BMDP, along with NCSS and nQuery, etc:

  http://www.statsol.ie/index.php?pageID=6

I was going to reference BMDP from my memories of some folks that used it back 
in the 80's, but was away at a meeting and then I noted Doug's reference to it.


SPSS sold Systat to a group in India (Cranes Software) , which then 
re-constituted Systat Software:

  http://www.systat.com/


StatView was another of those early programs dating from the mid-to-late 80's 
that went through various incarnations, was eventually bought by SAS, which 
then shut it down in favor of JMP.

Regards,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Use of R in clinical trials

2010-02-18 Thread Marc Schwartz
On Feb 18, 2010, at 3:50 PM, Nordlund, Dan (DSHS/RDA) wrote:

>> -Original Message-
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
>> Behalf Of Peter Dalgaard
>> Sent: Thursday, February 18, 2010 12:44 PM
>> To: Douglas Bates
>> Cc: r-help@r-project.org; Bert Gunter
>> Subject: Re: [R] Use of R in clinical trials
>> 
>> 
 (Corrections/additional information welcome!)
>>> 
>>> My recollection is that the BMD programs (which, in a later version,
>>> became BMDP) predated SAS and were specifically for BioMeDical
>>> analysis.
>> 
>> How could I forget those! Yes, my old (as in 1980-1985) boss at the
>> University hospital even had the manual in the office. It wasn't a
>> statistical system though, more a suite of single-purpose computer
>> programs with a rigid control-card specification format.
>> 
>> BTW, they were apparently put in the public domain by UCLA, but I wonder
>> where they went?
>> 
> 
> I believe BMDP was bought by SPSS around 1996.  SPSS also purchased Systat in 
> that same time period I believe.
> 
> Dan


Statistical Solutions has BMDP, along with NCSS and nQuery, etc:

  http://www.statsol.ie/index.php?pageID=6

I was going to reference BMDP from my memories of some folks that used it back 
in the 80's, but was away at a meeting and then I noted Doug's reference to it.


SPSS sold Systat to a group in India (Cranes Software) , which then 
re-constituted Systat Software:

  http://www.systat.com/


StatView was another of those early programs dating from the mid-to-late 80's 
that went through various incarnations, was eventually bought by SAS, which 
then shut it down in favor of JMP.

Regards,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Univariate Normal Mixtures - HELP

2010-02-18 Thread vaibhav dua
Hi,
 
I'm generating a mixture of 2 univariate normal distributions using norMix and 
rnorMix and would like to put an equality constraint on the means. Here is my 
code snippet:

library(nor1mix)

X <- norMix(mu=c(50, 60 ), sig2=c(5,10), w=c(0.5, 0.5))

mixData <- rnorMix(1000, X)

plot(mixData, type="l")


I do get the 1000 random numbers in equal proportion, but I want to confirm 
whether the mean is the same or not

Any help will be highly appreciated

Victor
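
A quick sanity check (a sketch, not a constraint): the sample mean of the
draws should be close to the theoretical mixture mean
w1*mu1 + w2*mu2 = 0.5*50 + 0.5*60 = 55.

```r
library(nor1mix)
X <- norMix(mu = c(50, 60), sig2 = c(5, 10), w = c(0.5, 0.5))
set.seed(1)
mixData <- rnorMix(1000, X)
mean(mixData)   # should be close to 55
```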



  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cluster analysis

2010-02-18 Thread Steve_Friedman
Without knowing what your data set really looks like, I'd look to decision
trees - specifically the rpart package with method = "class".

Your problem may not be appropriate in that environment, but it is hard to
say with limited explanation of issues.

good luck

Steve Friedman Ph. D.
Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034

steve_fried...@nps.gov
Office (305) 224 - 4282
Fax (305) 224 - 4147


   
From: Dong He (sent by r-help-boun...@r-project.org)
To: r-help@r-project.org
Subject: [R] cluster analysis
Date: 02/18/2010 04:54 PM




Hi Folks,

I want to apply cluster analysis on a categorical data set,  could you
recommend me some R package and suggestion?

Thanks!

Dong

 [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] plot more series or more colors

2010-02-18 Thread joonR

Thanks for helping!
-- 
View this message in context: 
http://n4.nabble.com/plot-more-series-or-more-colors-tp1556751p1560863.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Possible to save workspace image including packages and class definitions?

2010-02-18 Thread Janko Thyson
Hi everyone!

 

Is it possible to save an image of the workspace where 

1)  Packages

2)  Classes

are saved along with the image?

 

Until now I have only managed to save a workspace image that contained all
variables (including functions). When loading this image back into a new
session, packages and class definitions were missing:

 

save.image(file="C:/temp/blabla.Rdata")

 

Thanks,

Janko

 

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] cluster analysis

2010-02-18 Thread Dong He
Hi Folks,

I want to apply cluster analysis on a categorical data set,  could you
recommend me some R package and suggestion?

Thanks!

Dong

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problem with multiple plots (mfrow, mar)

2010-02-18 Thread Greg Snow
Use outer margins.  Try something like:

> par(mfrow=c(3,1), mar=c(0,4,0,2)+0.1, oma=c(5,0,3,0)+0.1 )

Then do your plots without resetting margins.

Also you can use xaxt='n' rather than axes=FALSE to suppress just the x axis 
and not have to do the y axis and box by hand.

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of Peter Neuhaus
> Sent: Thursday, February 18, 2010 10:11 AM
> To: r-help@r-project.org
> Subject: [R] problem with multiple plots (mfrow, mar)
> 
> Dear R-users,
> 
> I often stack plots that have the same x-axis. To save space and have
> the plots themselves as large as possible I like to minimize the
> margins
> between the plots to zero. I use the "mfrow" and "mar" parameters to
> achieve this.
> 
> However, the different margin settings for the individual plots lead to
> the inner plots being higher than the two outer plots. To make the
> data in the individual subplots visually comparable, I would like
> to have all plots with a plotting area of exactly the same height.
> 
> How would that be done in R?
> 
> Here's some example code to illustrate my problem:
> 
> # BEGIN
> x <- 1:10
> a <- runif(10, 0, 5)
> b <- runif(10, 0, 5)
> c <- runif(10, 0, 5)
> 
> ylim <- c(0, 5)
> 
> par(mfrow=c(3,1))
> par(mar=c(0,4.1,2.1,3.1))
> 
> plot(x, a, type="o", ylim=ylim, axes=FALSE)
> axis(1, labels=FALSE)
> axis(2)
> axis(3)
> axis(4)
> box()
> 
> par(mar=c(0,4.1,0,3.1))
> 
> plot(x, b, type="o", ylim=ylim, axes=FALSE)
> axis(1, labels=FALSE)
> axis(2)
> axis(3, labels=FALSE)
> axis(4)
> box()
> 
> par(mar=c(2.1,4.1,0,3.1))
> 
> plot(x, c, type="o", ylim=ylim, axes=FALSE)
> axis(1)
> axis(2)
> axis(3, labels=FALSE)
> axis(4)
> box()
> # END
> 
> Thanks in advance,
> 
> Peter
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Use of R in clinical trials

2010-02-18 Thread Nordlund, Dan (DSHS/RDA)
> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
> Behalf Of Peter Dalgaard
> Sent: Thursday, February 18, 2010 12:44 PM
> To: Douglas Bates
> Cc: r-help@r-project.org; Bert Gunter
> Subject: Re: [R] Use of R in clinical trials
> 
> 
> >> (Corrections/additional information welcome!)
> >
> > My recollection is that the BMD programs (which, in a later version,
> > became BMDP) predated SAS and were specifically for BioMeDical
> > analysis.
> 
> How could I forget those! Yes, my old (as in 1980-1985) boss at the
> University hospital even had the manual in the office. It wasn't a
> statistical system though, more a suite of single-purpose computer
> programs with a rigid control-card specification format.
> 
> BTW, they were apparently put in the public domain by UCLA, but I wonder
> where they went?
> 

I believe BMDP was bought by SPSS around 1996.  SPSS also purchased Systat in 
that same time period I believe.

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA  98504-5204

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] non-linear constrained optimization

2010-02-18 Thread Ravi Varadhan
Hi,

I have a nonlinear constrained optimization code that can handle nonlinear
(and linear) constraints.  Send me an email if you are interested.

There is also a package available on R-forge, called Rsolnp.

Hope this is helpful,
Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: rvarad...@jhmi.edu

Webpage:
http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.html

 




-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Brandon Zicha
Sent: Wednesday, February 17, 2010 10:53 AM
To: r-help@r-project.org
Subject: [R] non-linear constrained optimization

All,

I have searched the previous help boards and discovered the problem  
with Rdonlp2 - Specifically, its non-availability. I thought that this  
was my solution, but perhaps there is a better way that you all could  
help me with.  I imagine that this problem is trivial to people such  
as the experts on this mailing list.

I am trying to solve this problem over and over again in a simulation:

I want to find the values of x and y which minimize
f(x,y) = sqrt((z-x)^2+(w-y)^2)

subject to the constraints:
0=< sqrt((z2-x)^2+(w2-y)^2) - d2
0=< sqrt((z3-x)^2+(w3-y)^2) - d3
.
0=< sqrt((zk-x)^2+(wk-y)^2) - dk

where zi, wi, di are known scalars.
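Not from the original thread, but one base-R way to sketch this is a quadratic-penalty approach with optim(); all the scalar values below are made up for illustration:

```r
# Hypothetical stand-ins for the known scalars z, w, zi, wi, di
z  <- 0;  w  <- 0                      # target point (z, w)
zi <- c(0.2, -1,  1)                   # constraint centres
wi <- c(0.2,  1, -1)
di <- c(0.6,  0.5, 0.5)                # required minimum distances

obj  <- function(p) sqrt((z - p[1])^2 + (w - p[2])^2)
# Amount by which each constraint sqrt((zi-x)^2+(wi-y)^2) >= di is violated
viol <- function(p) pmax(di - sqrt((zi - p[1])^2 + (wi - p[2])^2), 0)
# Penalised objective: violations are pushed out by a stiff quadratic penalty
pen  <- function(p) obj(p) + 1e4 * sum(viol(p)^2)

fit <- optim(c(0.5, 0.5), pen)         # default Nelder-Mead
fit$par                                # candidate (x, y)
max(viol(fit$par))                     # should be ~0, i.e. feasible
```

For repeated use in a simulation this is crude but fast; a proper constrained solver such as solnp() from the Rsolnp package mentioned above should give cleaner solutions on the constraint boundary.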

I would appreciate any help with how to implement this in R.

Many thanks,

Brandon Z.

University of Antwerp



[R] Graphics Question

2010-02-18 Thread Lanna Jin
Hello All,

I'm having some issues controlling graphics in R.  I was wondering if anyone
may help me tackle this problem:

Given a data frame "X" with variables "Year", "Factor" (w/ n groups), and
"Freq"

How do I create a single graphic with the following plots aligned in a
vertical stack?

1. box plot in the fashion Freq~Year|Factor
2. box plot ignoring Factor, Freq~Year

Using the following code:

library(lattice)
tp1 <- bwplot(Freq ~ Year, data = X, main = "All data",
              ylab = "Freq", xlab = "Year", layout = c(1, 1))
tp2 <- bwplot(Freq ~ Year | Factor, data = X,
              ylab = "Freq", xlab = "Year", layout = c(1, 2))
plot(tp1, split = c(1, 1, 1, 2)); plot(tp2, split = c(1, 2, 1, 2), newpage = FALSE)

...I've managed to get a graphic that looks like two unrelated images on one
screen: one image with the plot without taking Factor into account; and the
other image, a smooshed lattice plot consisting of n plots in one column
with n rows.  The axes are not aligned and are scaled differently (in the second
image the individual plots are half the height of those in the first image).
How can I fix this?  It would be nice to create a graphic with all of the
plots stacked on top of each other with the same axis dimensions and
ranges.
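One route (a sketch on made-up data, not tested against the original) is to use print.trellis' position argument instead of split; position = c(xmin, ymin, xmax, ymax) gives each plot an explicit share of the page, so the panel heights can be matched:

```r
library(lattice)

# Hypothetical stand-in for the poster's data frame X
set.seed(1)
X <- data.frame(Year   = factor(rep(2001:2003, each = 40)),
                Factor = factor(rep(c("A", "B"), times = 60)),
                Freq   = rpois(120, 10))

tp1 <- bwplot(Freq ~ Year, data = X, main = "All data",
              ylab = "Freq", xlab = "Year")
tp2 <- bwplot(Freq ~ Year | Factor, data = X,
              ylab = "Freq", xlab = "Year", layout = c(1, 2))

# Give the conditioned plot two thirds of the page (it has two panels),
# so each of its panels is roughly the height of tp1's single panel
print(tp1, position = c(0, 2/3, 1, 1),   more = TRUE)
print(tp2, position = c(0, 0,   1, 2/3), more = FALSE)
```

Matching the y-axis ranges can then be done by passing the same ylim to both bwplot() calls.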

Thanks for your help in advance!

-- 
Lanna Jin

lanna...@gmail.com
510-898-8525




Re: [R] how to execute R script piece by piece on SSH terminal

2010-02-18 Thread Vojtěch Zeisek
Hello,
If you run R in a terminal, you can type R to start the R software, then use 
the command source("/path/to/your/script") to load the script, and then call 
the functions from the script as usual. Optionally, you can move to the 
directory containing the script with setwd("/path/to/the/directory"). Or I 
would recommend installing RKWard (http://rkward.sourceforge.net/), which is 
an excellent graphical user interface for R.
Best regards,
Vojtěch Zeisek

Dne Čt 18. února 2010 21:36:34 xin...@stat.psu.edu napsal(a):
> hi, I am new to the Linux and R environments. I have an existing R script. I
> wonder how to open my R script on a Linux platform and execute selected
> R commands?
> 
> thanks
> 
> Xin
> 
-- 
Vojtěch Zeisek

Komunita openSUSE GNU/Linuxu /
Community of the openSUSE GNU/Linux

http://www.opensuse.org/
http://web.natur.cuni.cz/~zeisek/




Re: [R] how to execute R script piece by piece on SSH terminal

2010-02-18 Thread Sarah Goslee
At the very simplest level, you can open the R script in a text editor.
Then start R at the command line in the terminal by typing
R
and then copy and paste the desired lines from the text file into the
terminal.

There are plenty of more elegant and complex solutions, but that
one is easy and will work.
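To make this concrete, a tiny sketch (the file name below is made up):

```r
# Create a stand-in script; in practice this file already exists
writeLines(c("x <- 1 + 1", "print(x)"), "myscript.R")

# Inside an R session started from the SSH terminal:
# echo = TRUE shows each line as it is executed, so you can follow along
source("myscript.R", echo = TRUE)
```

For running only selected lines, copying and pasting into the R prompt, as described above, is the simplest route.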

Sarah

On Thu, Feb 18, 2010 at 3:36 PM,   wrote:
> hi, I am new to the Linux and R environments. I have an existing R script. I
> wonder how to open my R script on a Linux platform and execute selected
> R commands?
>
> thanks
>
> Xin
>
-- 
Sarah Goslee
http://www.functionaldiversity.org



[R] lme - incorporating measurement error with estimated V-C matrix

2010-02-18 Thread Todd Ogden

I have data (each Y_i is a vector) in the form of

Y_i = X_i \beta_i   + Z_i b_i + epsilon_i

Were it not for the measurement error (the epsilon_i) it's a very  
simple model --- nice and balanced, compound symmetry, and I'd just  
use lme(y ~ x1 + x2, random=~1|subj, ...) but the measurement error is  
throwing me off.


Because the Y_i are actually derived from other data, I am able to do  
some bootstrapping and get an estimate of the V-C matrix of epsilon_i.


But I haven't been able to figure out how to weight the observations  
properly in an lme() call.


Some searching of the archives led me to a 2004 posting (courtesy of  
Dave Atkins) of two functions written by Jose: varRan and varWithin.   
This gives me some hope (a good deal of hope, actually), but I can't  
understand the arguments or how to use these functions.


Here's the posting:
http://tolstoy.newcastle.edu.au/R/help/04/04/0245.html

Any hints would be greatly appreciated.

Thanks,

Todd
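[Not a full answer, but if a diagonal approximation to the error V-C matrix is acceptable, nlme's varFixed() can fix the within-group variance proportional to a known per-observation variance. A sketch on made-up data — the variable names are hypothetical, and vhat plays the role of the bootstrapped variances:]

```r
library(nlme)
set.seed(1)

# Fake long-format data: 20 subjects, 4 observations each
d <- data.frame(subj = factor(rep(1:20, each = 4)),
                x1 = rnorm(80), x2 = rnorm(80))
d$vhat <- runif(80, 0.5, 2)              # "known" error variances
d$y <- 1 + d$x1 - d$x2 +
       rep(rnorm(20), each = 4) +        # random intercept per subject
       rnorm(80, sd = sqrt(d$vhat))      # heteroscedastic measurement error

# varFixed(~ vhat) fixes Var(eps_i) proportional to vhat --
# note this handles only the diagonal, not a full covariance matrix
fit <- lme(y ~ x1 + x2, random = ~ 1 | subj, data = d,
           weights = varFixed(~ vhat))
fixef(fit)
```

Handling a full (non-diagonal) known error covariance would need something along the lines of the varRan/varWithin functions in the linked posting, or a generalised least squares fit built by hand.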



Re: [R] Use of R in clinical trials

2010-02-18 Thread Peter Dalgaard



(Corrections/additional information welcome!)


My recollection is that the BMD programs (which, in a later version,
became BMDP) predated SAS and were specifically for BioMeDical
analysis.  


How could I forget those! Yes, my old (as in 1980-1985) boss at the 
University hospital even had the manual in the office. It wasn't a 
statistical system though, more a suite of single-purpose computer 
programs with a rigid control-card specification format.


BTW, they were apparently put in the public domain by UCLA, but I wonder 
where they went?


--
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907



[R] how to execute R script piece by piece on SSH terminal

2010-02-18 Thread xinwei
hi, I am new to the Linux and R environments. I have an existing R script. I
wonder how to open my R script on a Linux platform and execute selected
R commands?

thanks

Xin



Re: [R] Extracting values from a list

2010-02-18 Thread chipmaney


Thanks. As a follow-up, how do I extract the list element names (i.e., 4-2 or 44-1)?

thanks,

chipper
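[Since sapply() keeps the names of the by() result, the element names come along for free; a self-contained sketch with made-up data shaped like the original:]

```r
set.seed(1)
# Hypothetical stand-in for Herb.df
Herb.df <- data.frame(ID  = rep(c("4-2", "44-1"), each = 16),
                      grp = factor(rep(1:8, 4)),
                      val = rnorm(32))

# One Kruskal-Wallis test per ID, as with the original by() call
res <- by(Herb.df, Herb.df$ID,
          function(d) kruskal.test(val ~ grp, data = d))

pvals <- sapply(res, function(x) x$p.value)
pvals          # named numeric vector of p-values
names(pvals)   # the element names, e.g. "4-2" "44-1"
```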

Date: Thu, 18 Feb 2010 11:56:45 -0800
From: ml-node+1560750-540257540-69...@n4.nabble.com
To: chipma...@hotmail.com
Subject: Re: Extracting values from a list

Try this:

sapply(x, '[', 'p.value')

On Thu, Feb 18, 2010 at 5:21 PM, chipmaney <[hidden email]> wrote:
>
> I have run a kruskal.test() using the by() function, which returns a list of
> results like the following (subset of results):
>
> Herb.df$ID: 4-2
>   Kruskal-Wallis chi-squared = 18.93, df = 7, p-value = 0.00841
>
> Herb.df$ID: 44-1
>   Kruskal-Wallis chi-squared = 4.43, df = 6, p-value = 0.6187
>
> So then, how do I extract a vector of p-values (i.e., result$p.value) for
> every element in the list?

-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

-- 
View this message in context: 
http://n4.nabble.com/Extracting-values-from-a-list-tp1560701p1560752.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Extracting values from a list

2010-02-18 Thread Henrique Dallazuanna
Try this:

sapply(x, '[', 'p.value')

On Thu, Feb 18, 2010 at 5:21 PM, chipmaney  wrote:
>
> I have run a kruskal.test() using the by() function, which returns a list of
> results like the following (subset of results):
>
> Herb.df$ID: 4-2
>       Kruskal-Wallis chi-squared = 18.93, df = 7, p-value = 0.00841
> 
> Herb.df$ID: 44-1
>        Kruskal-Wallis chi-squared = 4.43, df = 6, p-value = 0.6187
>
>
> So then, how do I extract a vector of p-values (i.e., result$p.value) for
> every element in the list?
>
>
>
>
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O



Re: [R] popbio and stochastic lambda calculation

2010-02-18 Thread Chris Stubben


Shawn Morrison-2 wrote:
> 
> # The paper reports a 95% CI of 0.79 - 1.10
> # "My" reproduced result for the CIs is much larger, especially on the 
> upper end. Why would this be?
> # The authors report using the 'delta' method (Caswell, 2001) to 
> calculate the CI in which the
> 
> 

Shawn,

I probably can't help much with the vitalsim example, but I would check Box
8.10 in Morris and Doak (2002).

I do have a few ideas about the delta method below.

# List the vital rates

vr<-list(cub= 0.64, yly = 0.67, sub=0.765, adt=0.835, mx=0.467)

# and the matrix using an expression in R

el<- expression(
 0,  0,  0,  0,  adt*mx, 
 cub,0,  0,  0,  0, 
 0,  yly,0,  0,  0, 
 0,  0,  sub,0,  0, 
 0,  0,  0,  sub,adt)

# this should get the projection matrix

A<-matrix( sapply( el, eval, vr), nrow=5, byrow=TRUE)

lambda(A)
[1] 0.9534346

# use the vitalsens function to get the vital-rate sensitivities and
# save the second column

vitalsens(el, vr)
    estimate sensitivity elasticity
cub    0.640   0.1236186 0.08298088
yly    0.670   0.1180835 0.08298088
sub    0.765   0.2068390 0.16596176
adt    0.835   0.7628261 0.66807647
mx     0.467   0.1694131 0.08298088

sens<-vitalsens(el, vr)[,2]


# I'm not sure about the covariance matrix next.  In Step 7 in Slakski et al
2007 ("Calculating the variance of the finite rate of population change from
a matrix model in Mathematica") they just use the square of the standard
errors, so I'll do the same...

se<-list(cub= 0.107, yly = 0.142, sub=0.149, adult=0.106, mx=0.09)
cov<-diag(unlist(se)^2)

## and then the variance of lambda from step 8
var <- t(sens) %*% (cov %*% sens)
var
            [,1]
[1,] 0.008176676

# and the confidence intervals

lambda(A) - 1.96*sqrt(var)
lambda(A) + 1.96*sqrt(var)

The CI of 0.78 to 1.13 is close to the paper's.

Hope that helps,

Chris






Re: [R] Extracting values from a list

2010-02-18 Thread Paul Hiemstra

chipmaney schreef:

I have run a kruskal.test() using the by() function, which returns a list of
results like the following (subset of results):

Herb.df$ID: 4-2
   Kruskal-Wallis chi-squared = 18.93, df = 7, p-value = 0.00841

Herb.df$ID: 44-1
Kruskal-Wallis chi-squared = 4.43, df = 6, p-value = 0.6187


So then, how do I extract a vector of p-values (i.e., result$p.value) for
every element in the list?



  
If result$p.value normally returns the p-value from a kruskal.test 
result, you can probably do something like:


vector_pvalues = sapply(result_from_by_list, function(x) x$p.value)

cheers,
Paul



Re: [R] Use of R in clinical trials

2010-02-18 Thread Bert Gunter
Yes, Doug is correct and I'm wrong. In fact, his comment jogged MY memory --
I actually used BMDP a bit in the late 70's (I think it was).

Thanks to Doug for corrected chronology.

Bert Gunter
Genentech Nonclinical Biostatistics
 
 

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Jeff Laake
Sent: Thursday, February 18, 2010 11:28 AM
To: r-help@r-project.org
Subject: Re: [R] Use of R in clinical trials

I am old enough.  Memory isn't always reliable, but Doug Bates' recounting 
is what I remember, and a quick search has BMDP developed in 1961 and SAS 
in 1966.  To my surprise, the search produced a site that offered BMDP 
for sale.

On 2/18/2010 11:15 AM, Peter Dalgaard wrote:
> Christopher W. Ryan wrote:
>> Pure Food and Drug Act: 1906
>> FDA: 1930s
>> founding of SAS: early 1970s
>>
>> (from the history websites of SAS and FDA)
>>
>> What did pharmaceutical companies use for data analysis before there 
>> was SAS? And was there much angst over the change to SAS from 
>> whatever was in use before?
>>
>> Or was there not such emphasis on and need for thorough data analysis 
>> back then?
>
> Well, I'm not quite old enough for this, but the story I heard is that 
> before SAS was the desktop calculator, essentially. Statistics had 
> correspondingly enormous focus on balanced designs, allowing 
> computation to be reduced to means and sums of squares. This would 
> typically be left to consulting firms employing (human) computers to 
> literally do the sums. Digital computers had of course been around for 
> decades at the time but mostly for hard core physics. (Well, actually, 
> they were finding their way into accounting too.) So SAS was, I 
> expect, pretty uniformly a relief.
>
> At the same time, the requirements of the FDA have been tightening; I 
> suppose partly due to technological feasibility, partly in response to 
> certain practises being recognised as dubious, like selective 
> publication, multiple testing, etc. And more data are required since 
> new drugs are rarely all that much better than older ones, while the 
> worries about side effects have increased.
>
>
>> --Chris
>> Christopher W. Ryan, MD
>> SUNY Upstate Medical University Clinical Campus at Binghamton
>> 425 Robinson Street, Binghamton, NY  13904
>> cryanatbinghamtondotedu
>
>
>



Re: [R] logistic regression - what is being predicted when using

2010-02-18 Thread Ted Harding
On 18-Feb-10 18:58:57, Dimitri Liakhovitski wrote:
> Dear gurus,
> I've analyzed a (fake) data set ("data") using logistic regression
> (glm):
> 
> logreg1 <- glm(z ~ x1 + x2 + y, data=data, family=binomial("logit"),
> na.action=na.pass)
> 
> Then, I created a data frame with 2 fixed levels (0 and 1) for each
> predictor:
> 
> attach(data)
> x1<-c(0,1)
> x2<-c(0,1)
> y<-c(0,1)
> newdata1<-data.frame(expand.grid(x1,x2,y))
> names(newdata1)<-c("x1","x2","y")
> 
> Finally, I calculated model-predicted probabilities for each
> combination of those fixed levels:
> 
> newdata1$predicted <-predict(logreg1,newdata=newdata1, type="response")
> 
> I am pretty sure the results I get (see the table below) are actual
> probabilities. But just in case - could someone please confirm that
> these are probabilities rather than log odds or odds?
> Thanks a lot!
> 
> x1 x2 y predicted
> 1  0  0 0   0.08700468
> 2  1  0 0   0.19262901
> 3  0  1 0   0.27108334
> 4  1  1 0   0.48216220
> 5  0  0 1   0.53686154
> 6  1  0 1   0.74373367
> 7  0  1 1   0.81896484
> 8  1  1 1   0.91887072
> -- 
> Dimitri Liakhovitski

Yes, they are predicted probabilities of response Z=1.
You specified this by setting 'type="response"'.

See ?predict.glm (the method for 'predict' which is used for GLMs).
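[A quick way to convince oneself, on simulated data rather than the original's:]

```r
set.seed(42)
d <- data.frame(x1 = rbinom(200, 1, 0.5),
                z  = rbinom(200, 1, 0.5))
fit <- glm(z ~ x1, data = d, family = binomial("logit"))
nd  <- data.frame(x1 = c(0, 1))

p   <- predict(fit, newdata = nd, type = "response")  # probabilities
eta <- predict(fit, newdata = nd, type = "link")      # log-odds
all.equal(p, plogis(eta))   # TRUE: response = inverse logit of the link
```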

Ted.


E-Mail: (Ted Harding) 
Fax-to-email: +44 (0)870 094 0861
Date: 18-Feb-10   Time: 19:38:54
-- XFMail --



Re: [R] Use of R in clinical trials

2010-02-18 Thread Jeff Laake
I am old enough.  Memory isn't always reliable, but Doug Bates' recounting 
is what I remember, and a quick search has BMDP developed in 1961 and SAS 
in 1966.  To my surprise, the search produced a site that offered BMDP 
for sale.


On 2/18/2010 11:15 AM, Peter Dalgaard wrote:

Christopher W. Ryan wrote:

Pure Food and Drug Act: 1906
FDA: 1930s
founding of SAS: early 1970s

(from the history websites of SAS and FDA)

What did pharmaceutical companies use for data analysis before there 
was SAS? And was there much angst over the change to SAS from 
whatever was in use before?


Or was there not such emphasis on and need for thorough data analysis 
back then?


Well, I'm not quite old enough for this, but the story I heard is that 
before SAS was the desktop calculator, essentially. Statistics had 
correspondingly enormous focus on balanced designs, allowing 
computation to be reduced to means and sums of squares. This would 
typically be left to consulting firms employing (human) computers to 
literally do the sums. Digital computers had of course been around for 
decades at the time but mostly for hard core physics. (Well, actually, 
they were finding their way into accounting too.) So SAS was, I 
expect, pretty uniformly a relief.


At the same time, the requirements of the FDA have been tightening; I 
suppose partly due to technological feasibility, partly in response to 
certain practises being recognised as dubious, like selective 
publication, multiple testing, etc. And more data are required since 
new drugs are rarely all that much better than older ones, while the 
worries about side effects have increased.




--Chris
Christopher W. Ryan, MD
SUNY Upstate Medical University Clinical Campus at Binghamton
425 Robinson Street, Binghamton, NY  13904
cryanatbinghamtondotedu








[R] Extracting values from a list

2010-02-18 Thread chipmaney

I have run a kruskal.test() using the by() function, which returns a list of
results like the following (subset of results):

Herb.df$ID: 4-2
   Kruskal-Wallis chi-squared = 18.93, df = 7, p-value = 0.00841

Herb.df$ID: 44-1
Kruskal-Wallis chi-squared = 4.43, df = 6, p-value = 0.6187


So then, how do I extract a vector of p-values (i.e., result$p.value) for
every element in the list?





Re: [R] Use of R in clinical trials

2010-02-18 Thread Cody Hamilton
Points regarding the advantages of LaTeX are very well taken. If I were 
fortunate enough to have complete ownership of the document (as might be the 
case with a DSMB report produced by the Biostat group), then LaTeX would be a 
wonderful choice.  Though I am not a LaTeX user, I can easily imagine that the 
productivity gains could be considerable.

Unfortunately, in most cases the Biostatistics group is responsible for 
providing a relatively small piece of the overall document which is owned by 
another group that inevitably uses MS Office.


--- On Wed, 2/17/10, Erik Iverson  wrote:

> From: Erik Iverson 
> Subject: Re: [R] Use of R in clinical trials
> To: "Frank E Harrell Jr" 
> Cc: "Cody Hamilton" , r-help@r-project.org
> Date: Wednesday, February 17, 2010, 9:05 PM
> Frank E Harrell Jr wrote:
> > Cody,
> > 
> > How amazing that SAS is still used to produce reports
> that reviewers hate and that requires tedious low-level
> programming.  R + LaTeX has it all over that approach
> IMHO.  We have used that combination very successfully
> for several data and safety monitoring reporting tasks for
> clinical trials for the pharmaceutical industry.
> > 
> > Frank
> 
> I used to work for a research group that also used R +
> LaTeX to produce DSMB reports for clinical trials.  If
> the DSMB members had only been exposed to SAS reports
> before, you could not get them to stop praising the quality
> of the R + LaTeX reports, even years into a trial.
> 
> Erik
>







Re: [R] Use of R in clinical trials

2010-02-18 Thread Peter Dalgaard

Christopher W. Ryan wrote:

Pure Food and Drug Act: 1906
FDA: 1930s
founding of SAS: early 1970s

(from the history websites of SAS and FDA)

What did pharmaceutical companies use for data analysis before there was 
SAS? And was there much angst over the change to SAS from whatever was 
in use before?


Or was there not such emphasis on and need for thorough data analysis 
back then?


Well, I'm not quite old enough for this, but the story I heard is that 
before SAS was the desktop calculator, essentially. Statistics had 
correspondingly enormous focus on balanced designs, allowing computation 
to be reduced to means and sums of squares. This would typically be left 
to consulting firms employing (human) computers to literally do the 
sums. Digital computers had of course been around for decades at the 
time but mostly for hard core physics. (Well, actually, they were 
finding their way into accounting too.) So SAS was, I expect, pretty 
uniformly a relief.


At the same time, the requirements of the FDA have been tightening; I 
suppose partly due to technological feasibility, partly in response to 
certain practises being recognised as dubious, like selective 
publication, multiple testing, etc. And more data are required since new 
drugs are rarely all that much better than older ones, while the worries 
about side effects have increased.




--Chris
Christopher W. Ryan, MD
SUNY Upstate Medical University Clinical Campus at Binghamton
425 Robinson Street, Binghamton, NY  13904
cryanatbinghamtondotedu




--
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907



Re: [R] Use of R in clinical trials

2010-02-18 Thread Douglas Bates
On Thu, Feb 18, 2010 at 12:36 PM, Bert Gunter  wrote:
> The key dates are 1938 and 1962. The FDC act of 1938 essentially mandated
> (demonstration of) safety. The tox testing infrastructure grew from that. At
> that time, there were no computers, little data, little statistics
> methodology. Statistics played little role -- as is still mainly the case
> today for safety. Any safety findings whatever in safety testing raise a
> flag; statistical significance in the multiple testing framework is
> irrelevant.

> 1962 saw the Kefauver-Harris Amendments that mandated demonstration of
> efficacy. That was the key. The whole clinical trial framework and the
> relevant statistical design and analysis infrastructure flowed from that
> regulatory requirement. SAS's development soon after was therefore the first
> direct response to the statistical software needs that resulted. Note also,
> that statistical software was in its infancy at this time: before SAS there
> was Fortran and COBOL; there was no statistical software.

> So, as you can see, there essentially was **no** "before SAS".

> (Corrections/additional information welcome!)

My recollection is that the BMD programs (which, in a later version,
became BMDP) predated SAS and were specifically for BioMeDical
analysis.  Early statistical software was oriented to applications
areas: SPSS (Statistical Package for the Social Sciences) was the
predominant system used in the social sciences, BMD(P) in biomedical
areas and SAS in agricultural/life sciences settings.  Eventually the
more coherent framework and comparative ease-of-use of SAS (yes, I am
saying that with a straight face - in the days of batch jobs submitted
on punched cards with data residing on magnetic tape, there were
different standards of ease-of-use) won over more users in medical
fields.


> Bert Gunter
> Genentech Nonclinical Biostatistics
>
>
>
> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
> Behalf Of Christopher W. Ryan
> Sent: Thursday, February 18, 2010 10:09 AM
> To: r-help@r-project.org
> Cc: p.dalga...@biostat.ku.dk
> Subject: Re: [R] Use of R in clinical trials
>
> Pure Food and Drug Act: 1906
> FDA: 1930s
> founding of SAS: early 1970s
>
> (from the history websites of SAS and FDA)
>
> What did pharmaceutical companies use for data analysis before there was
> SAS? And was there much angst over the change to SAS from whatever was
> in use before?
>
> Or was there not such emphasis on and need for thorough data analysis
> back then?
>
> --Chris
> Christopher W. Ryan, MD
> SUNY Upstate Medical University Clinical Campus at Binghamton
> 425 Robinson Street, Binghamton, NY  13904
> cryanatbinghamtondotedu
>
> "If you want to build a ship, don't drum up the men to gather wood,
> divide the work and give orders. Instead, teach them to yearn for the
> vast and endless sea."  [Antoine de St. Exupery]
>
> Bert Gunter wrote:
>> DISCLAIMER: This represents my personal view and in no way reflects that
> of
>> my company.
>>
>> Warning: This is a long harangue that contains no useful information on R.
>> May be wise to delete without reading.
>> --
>>
>> Sorry folks, I still don't understand your comments. As Cody's original
> post
>> pointed out, there are a host of factors other than ease of
> programmability
>> or even quality of results that weigh against any change. To reiterate,
> all
>> companies have a huge infrastructure of **validated SAS code** that would
>> have to be replaced. This, in itself, would take years and cost tens of
>> millions of dollars at least. Also to reiterate, it's not only
>> statistical/reporting functionality but even more the integration into the
>> existing clinical database systems that would have to be rewritten **and
>> validated**. All this would have to be done while continuing full steam on
>> existing submissions. It is therefore not surprising to me that no pharma
>> company in its right mind even contemplates undertaking such an effort.
>>
>> To put these things into perspective. Let's say Pfizer has 200 SAS
>> programmers (it's probably more, as they are a large Pharma, but I dunno).
>> If each programmer costs, conservatively, $200K U.S. per year fully
> loaded,
>> that's $40 million U.S. for SAS Programmers. And this is probably a severe
>> underestimate. So the $14M quoted below is chicken feed -- it doesn't even
>> make the radar.
>>
>> To add further perspective, a single (large) pivotal clinical trial can
>> easily cost $250M . A delay in approval due to fooling around trying to
>> shift to a whole new software system could easily cause hundreds of
> million
>> to billions if it means a competitor gets to the marketplace first. So, to
>> repeat, SAS costs are chicken feed.
>>
>> Yes, I suppose that the present system institutionalizes mediocrity. How
>> could it be otherwise in any such large scale enterprise? Continuity,
>> reliability, and robustness are all orders of magnitude more important for
>> both the FDA and Pharma to get safe and efficacious drugs to the public.

Re: [R] Use of R in clinical trials

2010-02-18 Thread Rolf Turner

On 19/02/2010, at 1:12 AM, John Sorkin wrote:

> It is easy to devolve into visceral response mode, lose objectivity, and slip 
> into intolerance. R, S, S-Plus, SAS, PASW (nee SPSS), and STATA are all tools. 
> Each has strengths and weaknesses. No one is inherently better or worse than 
> another. The quality of the results produced by any one of them is a 
> function of the abilities of the person who manipulates them. Don't expect 
> quality work from any program unless the person running the program knows 
> what he, or she, is doing!  


I think this sort of ``moral relativism'' is specious.  It is certainly true
that the programmer or analyst has to know what he or she is doing,
irrespective of package or language.  And SAS at least has some relative
strengths in respect of its capacity to handle large data sets.  But overall,
in terms of having an effective environment in which to conduct statistical
analyses, there can be no question that the R/S/S-Plus group wins hands down.

cheers,

Rolf Turner
##
Attention: 
This e-mail message is privileged and confidential. If you are not the 
intended recipient please delete the message and notify the sender. 
Any views or opinions presented are solely those of the author.

This e-mail has been scanned and cleared by MailMarshal 
www.marshalsoftware.com
##

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] svm regression

2010-02-18 Thread Uwe Ligges



On 18.02.2010 19:43, madhu sankar wrote:

Hi,

I am having trouble with svm regression. It is not giving the right results.

example


model<- svm(dataTrain,classTrain,type="eps-regression")
predict(model, dataTest)

 36 37 38 39 40 41
42
-13.838257  -1.475401  10.502739  -3.047656  -8.713697   3.812873
1.741999
 43 44 45 46 47 48
49
  -6.034361 -13.469742   7.628642 -22.197060  -3.417444  -8.536890
-11.876133
 50
  -5.877457


My dataSet has 50 columns and 19 rows
my classSet has 50 columns and 1 row

My dataTrain has 35(1:35) columns and 19 rows
My classTrain has 35(1:35) columns and 1 row

My dataTest has 15(36:50) columns and 19 rows
My classTest has 15(36:50) columns and 1 row



Same problems as in my last mail:

I fear you are mixing up several things: regression vs. classification, 
rows vs. columns
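For what it is worth, a minimal sketch (simulated data, not the poster's; assumes the e1071 package) of the orientation svm() expects: one row per case and one column per feature, with a numeric target of length nrow(x) for regression.

```r
# svm() regression sketch: 50 cases as rows, 19 features as columns
library(e1071)
set.seed(1)
x <- matrix(rnorm(50 * 19), nrow = 50)   # 50 cases, 19 features
y <- 2 * x[, 1] + rnorm(50)              # numeric target -> regression
fit  <- svm(x[1:35, ], y[1:35], type = "eps-regression")
pred <- predict(fit, x[36:50, ])         # one prediction per test row
length(pred)                             # 15
```

If the cases are currently stored as columns, transpose with t() first.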




My results should be as follows:

  [1] -25.70  30.30 -58.50  -1.12   7.62 -16.10 -48.50  21.10  12.60 -43.00
[11] -47.30 -47.90 -38.40 -21.30  22.40


How do you know?

Uwe Ligges



But instead I get the wrong values. Can anyone help me with it?

Thanks,
Joji.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




Re: [R] svm regression

2010-02-18 Thread Uwe Ligges



On 18.02.2010 17:54, madhu sankar wrote:

Hi,

I am trying to use svm  for regression data.

this is how my data looks like:


dataTrain


x  y   z
1  4  6
2  5  4
3  7  5


classTrain


a
2
3
4


dataTest


x y  z
1 7  2
2 8  3


classTest


a
3
4
5


building the model

model<-svm(dataTrain,classTrain,type="nu-regression")
pred<- predict(model, dataTest)


pred

12
3.008842 3.120078

I don't understand what the values above mean. I need results similar to
classTest.



They are similar, aren't they?
You talk about svm *regression* rather than classification, yet call your 
objects "class...", which makes me worry that you have not understood some 
basics of the method.


Uwe Ligges





Thanks,
Joji.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




[R] Natural Language Processing of R help archives

2010-02-18 Thread Harsh
Hi useRs,
This is not so much a help request as it is a request for feedback about the
possibilities of using Natural Language Processing (NLP) techniques on the
r-help archives for a more 'effective' retrieval of answers.
A few points that may capture what I'm trying to get at:

1) R has an emerging set of packages for NLP and text mining. Would it be
possible to provide the r-help archives (for a certain period, of course) as a
text corpus?
2) It is easy to search for R problems and get 'exact' results via Google
search. rseek.org does a great job too. Would a semantic parser provide more
accurate results?
3) This effort probably falls under the Question-Answer modeling domain in
NLP, which is an area of application being used in chat automation,
chat-bots and to improve response prediction when a human interacts with a
query system.

I would like to use some of the NLP tool-kits available in Java and Python
with the r-help archive data and maybe create an application that uses
semantic filtering for query and search.
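As a toy illustration of the corpus idea (entirely made-up messages; base R only, though packages such as tm would be the natural next step), turning a few r-help style subject lines into a term-by-document count table gives the raw material for any retrieval or question-answering model:

```r
# Build a tiny term-by-document frequency table from fake message texts
msgs <- c("delete columns of zeros",
          "svm regression wrong results",
          "predict glm probabilities")
words <- strsplit(tolower(msgs), "[^a-z]+")
tdm <- table(term = unlist(words),
             doc  = rep(seq_along(msgs), lengths(words)))
tdm["regression", "2"]   # "regression" occurs once, in message 2
```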

I'd appreciate knowing what others think about such an undertaking.

Thank you.
Regards,
Harsh Singhal

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] logistic regression - what is being predicted when using predict - probabilities or odds?

2010-02-18 Thread Dimitri Liakhovitski
Dear gurus,
I've analyzed a (fake) data set ("data") using logistic regression (glm):

logreg1 <- glm(z ~ x1 + x2 + y, data=data, family=binomial("logit"),
na.action=na.pass)

Then, I created a data frame with 2 fixed levels (0 and 1) for each predictor:

attach(data)
x1<-c(0,1)
x2<-c(0,1)
y<-c(0,1)
newdata1<-data.frame(expand.grid(x1,x2,y))
names(newdata1)<-c("x1","x2","y")

Finally, I calculated model-predicted probabilities for each
combination of those fixed levels:

newdata1$predicted <-predict(logreg1,newdata=newdata1, type="response")

I am pretty sure the results I get (see the table below) are actual
probabilities. But just in case - could someone please confirm that
these are probabilities rather than log odds or odds?
Thanks a lot!

x1 x2 y predicted
1  0  0 0   0.08700468
2  1  0 0   0.19262901
3  0  1 0   0.27108334
4  1  1 0   0.48216220
5  0  0 1   0.53686154
6  1  0 1   0.74373367
7  0  1 1   0.81896484
8  1  1 1   0.91887072
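They are indeed probabilities. A quick self-contained check (simulated data, not Dimitri's): with a binomial glm, type = "response" returns fitted probabilities, the default type = "link" returns log-odds, and plogis() maps one onto the other.

```r
# type = "response" -> probabilities; type = "link" -> log-odds
set.seed(1)
d <- data.frame(x1 = rbinom(200, 1, 0.5))
d$z <- rbinom(200, 1, plogis(-1 + 2 * d$x1))
fit <- glm(z ~ x1, data = d, family = binomial("logit"))
nd  <- data.frame(x1 = c(0, 1))
p   <- predict(fit, newdata = nd, type = "response")  # probabilities in (0, 1)
lo  <- predict(fit, newdata = nd, type = "link")      # log-odds, unbounded
all.equal(unname(p), plogis(unname(lo)))              # TRUE
```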
-- 
Dimitri Liakhovitski

dimitri.liakhovit...@ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Funny result from rep(...) procedure

2010-02-18 Thread Henrik Bengtsson
What does

print(as.integer(branches))

give?  That is what rep() uses.

/H

On Thu, Feb 18, 2010 at 7:49 PM, dkStevens  wrote:
>
> I could send the entire bit of code but I was hoping that someone would
> recognize the issue from past experience. It may be an artifact of other
> parts of the code. I observed the problem in a larger context and cut
> out all that you see below. The comments were next to the results to
> clarify. Apparently I failed to be clear.
>
> Eik Vettorazzi [via R] wrote:
>> Sorry, not reproducible. This works for me (as expected):
>>
>> branches<-c(5,6)
>> iInd = 1
>>   for(i in 1:length(branches)) {
>>     print((1:branches[i])+iInd-1)   # iInd is a position shift of the
>> index
>>     ni = branches[i]
>>     print(i)
>>     print(ni)
>>     print(c(ni,rep(i,times=ni)))
>> # ... some interesting other stuff for my project that gets wrecked
>> because of this issue
>>     iInd = iInd + branches[i]
>> }
>>
>>
>> dkStevens schrieb:
>>
>> > I'm observing odd behavior of the rep(...) procedure when using
>> variables as
>> > parameters in a loop. Here's a simple loop on a vector 'branches'
>> that is
>> > c(5,6,5,5,5). The statement in question is
>> > print(c(ni,rep(i,times=ni)))
>> >
>> > that works properly first time through the loop but the second time,
>> when
>> > branches[2] = 6, only prints 5 values of i.
>> >
>> > Any ideas, anyone?
>> >
>> >   iInd = 1
>> >   for(i in 1:length(branches)) {
>> >     print((1:branches[i])+iInd-1)   # iInd is a position shift of
>> the index
>> >     ni = branches[i]
>> >     print(i)
>> >     print(ni)
>> >     print(c(ni,rep(i,times=ni)))
>> > # ... some interesting other stuff for my project that gets wrecked
>> because
>> > of this issue
>> >     iInd = iInd + branches[i]
>> > }
>> >
>> > # first pass through loop
>> > [1] 1 2 3 4 5             # branches[1] + iInd - 1 content
>> > [1] 1                        # i value to repeat 5 times
>> > [1] 5                        # ni = 5 on 1st pass
>> > [1] 5 1 1 1 1 1           # five values - 1 1 1 1 1 - OK
>> >
>> > # second pass through loop
>> > [1]  6  7  8  9 10 11   # branches[2] + iInd - 1 content
>> > [1] 2                       # i value to repeat 6 times
>> > [1] 6                       # ni = 6 on 2nd pass
>> > [1] 6 2 2 2 2 2          # 6 'twos' but only shows 5 'twos'
>> > print(c(ni,rep(i,times=ni))) - why not 6?
>> >
>> >
>> >
>>
>> --
>> Eik Vettorazzi
>> Institut für Medizinische Biometrie und Epidemiologie
>> Universitätsklinikum Hamburg-Eppendorf
>>
>> Martinistr. 52
>> 20246 Hamburg
>>
>> T ++49/40/7410-58243
>> F ++49/40/7410-57790
>>
>> __
>> [hidden email]
>> 
>> mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>> 
>> View message @
>> http://n4.nabble.com/Funny-result-from-rep-procedure-tp1560547p1560587.html
>>
>>
>>
>
> --
> David K Stevens, P.E., Ph.D., Professor
> Civil and Environmental Engineering
> Utah Water Research Laboratory
> 8200 Old Main Hill
> Logan, UT  84322-8200
> 435 797 3229 - voice
> 435 797 1363 - fax
> david.stev...@usu.edu
>
>
>
>
> --
> View this message in context: 
> http://n4.nabble.com/Funny-result-from-rep-procedure-tp1560547p1560643.html
> Sent from the R help mailing list archive at Nabble.com.
>
>        [[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Building R from source

2010-02-18 Thread Uwe Ligges

On 17.02.2010 15:38, rkevinbur...@charter.net wrote:

Thank you for the tip. I was used to inserting write statements and was 
surprised when they didn't work; reading this section, I see that I shouldn't 
have been doing this anyway.

One more question. Is there another call that I can use to print out a 
2-dimensional array? Since FORTRAN stores arrays in column-major order, it is 
hard to print an array by row as R does. Also, the manual doesn't say how to 
omit the data so that just the label is printed; it only says this is 
possible.



No idea, to be honest: I never did that from Fortran when R is available. I 
would always pass the data back to R and print from R, since in most cases 
the user probably wants the printed data in R in the end anyway.


Best,
Uwe



Thanks again.

Kevin

 Uwe Ligges  wrote:

Since no more information is given:
See Writing R Extensions, currently section 6.5: "Printing" and 6.5.1
"Printing from FORTRAN"


Uwe Ligges



On 17.02.2010 03:50, rkevinbur...@charter.net wrote:

I found the problem but not a solution. It turns out if I add the following 
lines to dqrdc2.f I get the error:

write(*,300) ldx,n,p
300 format(3i4)

I don't get a compile error but I get the seemingly unrelated error in linking 
R.DLL
I guess the question now is, "How do I add a simple print statement?". Or, what 
is wrong with the above print statement?

Thank you.

Kevin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






Re: [R] Funny result from rep(...) procedure

2010-02-18 Thread dkStevens

I could send the entire bit of code but I was hoping that someone would 
recognize the issue from past experience. It may be an artifact of other 
parts of the code. I observed the problem in a larger context and cut 
out all that you see below. The comments were next to the results to 
clarify. Apparently I failed to be clear.

Eik Vettorazzi [via R] wrote:
> Sorry, not reproducible. This works for me (as expected):
>
> branches<-c(5,6)
> iInd = 1
>   for(i in 1:length(branches)) {
> print((1:branches[i])+iInd-1)   # iInd is a position shift of the 
> index
> ni = branches[i]
> print(i)
> print(ni)
> print(c(ni,rep(i,times=ni)))
> # ... some interesting other stuff for my project that gets wrecked
> because of this issue
> iInd = iInd + branches[i]
> }
>
>
> dkStevens schrieb:
>
> > I'm observing odd behavior of the rep(...) procedure when using 
> variables as
> > parameters in a loop. Here's a simple loop on a vector 'branches' 
> that is
> > c(5,6,5,5,5). The statement in question is  
> > print(c(ni,rep(i,times=ni)))
> >
> > that works properly first time through the loop but the second time, 
> when
> > branches[2] = 6, only prints 5 values of i.
> >
> > Any ideas, anyone?
> >
> >   iInd = 1
> >   for(i in 1:length(branches)) {
> > print((1:branches[i])+iInd-1)   # iInd is a position shift of 
> the index
> > ni = branches[i]
> > print(i)
> > print(ni)
> > print(c(ni,rep(i,times=ni)))
> > # ... some interesting other stuff for my project that gets wrecked 
> because
> > of this issue
> > iInd = iInd + branches[i]
> > }
> >
> > # first pass through loop
> > [1] 1 2 3 4 5 # branches[1] + iInd - 1 content
> > [1] 1# i value to repeat 5 times
> > [1] 5# ni = 5 on 1st pass
> > [1] 5 1 1 1 1 1   # five values - 1 1 1 1 1 - OK
> >
> > # second pass through loop
> > [1]  6  7  8  9 10 11   # branches[2] + iInd - 1 content
> > [1] 2   # i value to repeat 6 times
> > [1] 6   # ni = 6 on 2nd pass
> > [1] 6 2 2 2 2 2  # 6 'twos' but only shows 5 'twos'
> > print(c(ni,rep(i,times=ni))) - why not 6?
> >
> >
> >  
>
> -- 
> Eik Vettorazzi
> Institut für Medizinische Biometrie und Epidemiologie
> Universitätsklinikum Hamburg-Eppendorf
>
> Martinistr. 52
> 20246 Hamburg
>
> T ++49/40/7410-58243
> F ++49/40/7410-57790
>
> __
> [hidden email] 
>  
> mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
> 
> View message @ 
> http://n4.nabble.com/Funny-result-from-rep-procedure-tp1560547p1560587.html 
>
>
>

-- 
David K Stevens, P.E., Ph.D., Professor
Civil and Environmental Engineering
Utah Water Research Laboratory
8200 Old Main Hill
Logan, UT  84322-8200
435 797 3229 - voice
435 797 1363 - fax
david.stev...@usu.edu




-- 
View this message in context: 
http://n4.nabble.com/Funny-result-from-rep-procedure-tp1560547p1560643.html
Sent from the R help mailing list archive at Nabble.com.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Funny result from rep(...) procedure

2010-02-18 Thread dkStevens

That's what I did. 'branches' was shown at the top.

branches = c(5,6,5,5,5)

I tested this. When I copy and paste into R 2.10.0 I get the result in the 
post. Perhaps I should reinstall R. I guess I don't see how much more 
narrow I can get than this.

  iInd = 1
  for(i in 1:length(branches)) {
print((1:branches[i])+iInd-1)   # iInd is a position shift of the index
ni = branches[i]
print(i)
print(ni)
print(c(ni,rep(i,times=ni)))
# ... some interesting other stuff for my project that gets wrecked 
because of this issue
iInd = iInd + branches[i]
}



Erik Iverson-3 [via R] wrote:
> Cannot reproduce, what is branches?  If you can narrow it down to a
> "commented, minimal, self-contained, reproducible" example, you're far
> more likely to get help from the list.
>
> dkStevens wrote:
>
> > I'm observing odd behavior of the rep(...) procedure when using 
> variables as
> > parameters in a loop. Here's a simple loop on a vector 'branches' 
> that is
> > c(5,6,5,5,5). The statement in question is  
> > print(c(ni,rep(i,times=ni)))
> >
> > that works properly first time through the loop but the second time, 
> when
> > branches[2] = 6, only prints 5 values of i.
> >
> > Any ideas, anyone?
> >
> >   iInd = 1
> >   for(i in 1:length(branches)) {
> > print((1:branches[i])+iInd-1)   # iInd is a position shift of 
> the index
> > ni = branches[i]
> > print(i)
> > print(ni)
> > print(c(ni,rep(i,times=ni)))
> > # ... some interesting other stuff for my project that gets wrecked 
> because
> > of this issue
> > iInd = iInd + branches[i]
> > }
> >
> > # first pass through loop
> > [1] 1 2 3 4 5 # branches[1] + iInd - 1 content
> > [1] 1# i value to repeat 5 times
> > [1] 5# ni = 5 on 1st pass
> > [1] 5 1 1 1 1 1   # five values - 1 1 1 1 1 - OK
> >
> > # second pass through loop
> > [1]  6  7  8  9 10 11   # branches[2] + iInd - 1 content
> > [1] 2   # i value to repeat 6 times
> > [1] 6   # ni = 6 on 2nd pass
> > [1] 6 2 2 2 2 2  # 6 'twos' but only shows 5 'twos'
> > print(c(ni,rep(i,times=ni))) - why not 6?
> >
> >
>
> __
> [hidden email] 
>  
> mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
> 
> View message @ 
> http://n4.nabble.com/Funny-result-from-rep-procedure-tp1560547p1560566.html 
>
>
>

-- 
David K Stevens, P.E., Ph.D., Professor
Civil and Environmental Engineering
Utah Water Research Laboratory
8200 Old Main Hill
Logan, UT  84322-8200
435 797 3229 - voice
435 797 1363 - fax
david.stev...@usu.edu




-- 
View this message in context: 
http://n4.nabble.com/Funny-result-from-rep-procedure-tp1560547p1560641.html
Sent from the R help mailing list archive at Nabble.com.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] svm regression

2010-02-18 Thread madhu sankar
Hi,

I am having trouble with svm regression. It is not giving the right results.

example

> model <- svm(dataTrain,classTrain,type="eps-regression")
> predict(model, dataTest)
36 37 38 39 40 41
42
-13.838257  -1.475401  10.502739  -3.047656  -8.713697   3.812873
1.741999
43 44 45 46 47 48
49
 -6.034361 -13.469742   7.628642 -22.197060  -3.417444  -8.536890
-11.876133
50
 -5.877457


My dataSet has 50 columns and 19 rows
my classSet has 50 columns and 1 row

My dataTrain has 35(1:35) columns and 19 rows
My classTrain has 35(1:35) columns and 1 row

My dataTest has 15(36:50) columns and 19 rows
My classTest has 15(36:50) columns and 1 row

My results should be as follows:

 [1] -25.70  30.30 -58.50  -1.12   7.62 -16.10 -48.50  21.10  12.60 -43.00
[11] -47.30 -47.90 -38.40 -21.30  22.40

But instead I get the wrong values. Can anyone help me with it?

Thanks,
Joji.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] error in using sample( )

2010-02-18 Thread Dennis Murphy
Works for me:

> sample(c(0,1,2),1,prob=c(0.2,0.3,0.5))
[1] 2
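One plausible cause (a guess; the thread does not confirm it): an object named `sample` in the workspace masking base::sample, so the call hits a function without those arguments. A sketch:

```r
# An accidental mask of base::sample reproduces an "unused argument(s)" error
sample <- function(x) x    # user-defined function shadows base::sample
err <- tryCatch(sample(c(0, 1, 2), 1, prob = c(0.2, 0.3, 0.5)),
                error = function(e) conditionMessage(e))
print(err)                 # an "unused argument(s) ..." style message
rm(sample)                 # remove the mask; conflicts() can find such clashes
sample(c(0, 1, 2), 1, prob = c(0.2, 0.3, 0.5))  # works again
```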


On Thu, Feb 18, 2010 at 7:48 AM,  wrote:

> Hi,
>
> I am using the command
>
> >sample(c(0,1,2),1,prob=c(0.2,0.3,0.5))
>
> and I have this error notification
>
> "Error in sample(c(0,1,2),1,prob=c(0.2,0.3,0.5)):
>  unused argument(s)(1,prob=c(0.2,0.3,0.5))
>
> I don't know what is going wrong. Please give me some suggestions.
>
> Thank you
>
>
> Best,
> Jing
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Hmisc summary.formula.reverse

2010-02-18 Thread Frank E Harrell Jr

Abhijit Dasgupta wrote:

Hello,

Can summary.formula.reverse be customized to allow other summary 
statistics to be reported rather than the quartiles and mean +/- sd? The 


Not easily.  I'm not sure which other statistics would be descriptive 
however; certainly not the min and max or standard error.



"fun" option apparently doesn't apply when method='reverse'


Correct

Frank



Thanks




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Frank E Harrell Jr   Professor and ChairmanSchool of Medicine
 Department of Biostatistics   Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Procedure not working for actual data

2010-02-18 Thread Bert Gunter
?traceback may be useful.

Bert Gunter
Genentech Nonclinical Biostatistics
 
 

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of jim holtman
Sent: Thursday, February 18, 2010 10:17 AM
To: ROLL Josh F
Cc: r-help@r-project.org
Subject: Re: [R] Procedure not working for actual data

Even though it may work for a small subset, it can still break on
larger sets.  Your code was doing a number of 'unlist' and tearing
apart the data and it is possible that some of the transformations
were not aligned with the data in the way you thought them to be.
What you need to do in that case is break down what is happening and
look at the data in each substep to make sure it is what you are
expecting.
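A guess at the kind of misalignment Jim means (sketch data, not Josh's): unlist(tapply(...)) returns values in grouped (sorted-by-group) order, while ave() returns them in the original row order, so a logical vector built from the former need not line up with the data frame's rows.

```r
# tapply groups and sorts by Bldgid; ave preserves row order
Bldgid <- c(1003, 1000, 1003, 1000)          # groups not in sorted order
Area   <- c(100, 40, 4.9, 170)
# grouped order: all of group 1000 first, then group 1003
unlist(tapply(Area, Bldgid, function(x) x / sum(x)))
# original row order, safe to attach back onto the data frame
ave(Area, Bldgid, FUN = function(x) x / sum(x))
```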

On Thu, Feb 18, 2010 at 11:50 AM, ROLL Josh F  wrote:
> Hey Jim,
>   That appears to work properly with my larger data set.  That's really
> strange to me, though: why would my procedure not work even though the test
> works correctly?  I have always coded under the assumption that the code
> doesn't do anything the user doesn't tell it to, but I can't see a problem
> with my code.
>
> I am looking at a similar problem with another piece of the code right now
> where everything looks right but it just isn't giving me the right output,
> although I haven't constructed a test yet.
>
> Thanks for the help.
>
> JR
>
> 
> From: jim holtman [mailto:jholt...@gmail.com]
> Sent: Wednesday, February 17, 2010 5:09 PM
> To: ROLL Josh F
> Cc: r-help@r-project.org
> Subject: Re: [R] Procedure not working for actual data
>
> Try this on your real data:
>
>> #Sample data
>> Bldgid<-c(1000,1000,1001,1002,1003,1003)
>> Maplot<-c(2,20001,3,30001,4,40001)
>> Area<-c(40,170,50,100,100,4.9)
>> #Construct Sample dataframe
>> MultiLotBldgs..<-data.frame(Bldgid,Maplot,Area)
>> #Get Building Area Proportions
>> MultiLotBldgs..$Prop <- ave(MultiLotBldgs..$Area, MultiLotBldgs..$Bldgid,
> + FUN=function(x) x / sum(x))
>>
>> # find not too small
>> notTooSmall <- !((MultiLotBldgs..$Area <= 45) | ((MultiLotBldgs..$Area >
>> 45) &
> + (MultiLotBldgs..$Prop < 0.05)))
>>
>> MultiLotBldgs2.. <- MultiLotBldgs..[notTooSmall,]
>> # print out results
>> MultiLotBldgs2..
>   Bldgid Maplot Area  Prop
> 2   1000  20001  170 0.8095238
> 3   1001  3   50 1.000
> 4   1002  30001  100 1.000
> 5   1003  4  100 0.9532888
>>
>>
>>
>
>
> On Wed, Feb 17, 2010 at 6:58 PM, ROLL Josh F  wrote:
>>
>> Sorry Just a generic list
>>
>> Is<-list()
>>
>> forgot to add that from my actual code
>> 
>> From: jim holtman [mailto:jholt...@gmail.com]
>> Sent: Wednesday, February 17, 2010 3:58 PM
>> To: ROLL Josh F
>> Cc: r-help@r-project.org
>> Subject: Re: [R] Procedure not working for actual data
>>
>> Your example does not work since "Is" is not defined.  What is it supposed
>> to be?
>>
>> On Wed, Feb 17, 2010 at 6:34 PM, LCOG1  wrote:
>>>
>>> Hello all,
>>>   I have what I feel is a unique situation which may not be resolved with
>>> this inquiry.  I have constructed the data set below so that I may give an
>>> example of what I'm doing.  The example works perfectly and I have no
>>> issues with it.  My problem arises with my actual data, which includes
>>> another 11 columns of data (used in later analysis) and a total of about
>>> 7000 cases (rows).  I mention the dimensions of the actual data because
>>> I'm wondering if the process below would encounter problems with more
>>> data.  To be specific, the problem occurs in the last step.
>>> Is$NotTooSmall gives me a binary output that is then put back into
>>> MultiLotBldgs.. (as shown in the example) to return the cases I want to
>>> keep.
>>>  In my actual data the binary designation is correct, but when
>>> MultiLotBldgs2.. returns it doesn't remove the cases that are FALSE in
>>> Is$NotTooSmall.  Like I said, my sample data works fine but my actual
>>> implementation does not.  Any suggestions?  I know this is not easy to
>>> answer without seeing the problem, but this is the best I can do without
>>> sending you all of my data.
>>>
>>> Cheers,
>>> JR
>>>
>>>
>>>
>>>
>>> #Sample data
>>> Bldgid<-c(1000,1000,1001,1002,1003,1003)
>>> Maplot<-c(2,20001,3,30001,4,40001)
>>> Area<-c(40,170,50,100,100,4.9)
>>> #Construct Sample dataframe
>>> MultiLotBldgs..<-data.frame(Bldgid,Maplot,Area)
>>> #Get Building Areas
>>> MultiLotBldgArea.X <- unlist(tapply(MultiLotBldgs..$Area,
>>> MultiLotBldgs..$Bldgid,
>>>                              function(x) x))
>>>
>>> # Calculate the proportion of the total building area in each piece of
>>> the
>>> building
>>> MultiLotBldgProp.X <- unlist(tapply(MultiLotBldgs..$Area,
>>> MultiLotBldgs..$Bldgid,
>>>                              function(x) x/sum(x)))
>>>
>>> #Identify buildings that should be considered for joining
>>> Is$NotTooSmall.X <- !(((MultiLotBldgArea.X <= 45) |
>>>                            ((MultiLotBldgArea.X > 45

Re: [R] Funny result from rep(...) procedure

2010-02-18 Thread Henrik Bengtsson
On Thu, Feb 18, 2010 at 7:15 PM, Erik Iverson  wrote:
>
>
> Erik Iverson wrote:
>>
>> Cannot reproduce, what is branches?  If you can narrow it down to a
>> "commented, minimal, self-contained, reproducible" example, you're far more
>> likely to get help from the list.
>>
>
> My blinded guess though, is something to do with FAQ 7.31.

Probably, e.g.

branches<-c(5,6);
branches <- branches - 1e-12;

gives the reported output (as expected).

/H
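To spell out the mechanism (a sketch of FAQ 7.31, not the poster's actual code): a count stored as 5.999... prints as 6 but truncates to 5 when rep() coerces `times` towards zero, and round() avoids the surprise.

```r
# Floating-point count vs. the integer rep() actually uses
ni <- 6 - 1e-12          # looks like 6 when printed
print(ni)                # 6
as.integer(ni)           # 5 -- truncation, not rounding
length(rep(2, times = ni))         # 5 repetitions
length(rep(2, times = round(ni)))  # 6 repetitions
```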
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Use of R in clinical trials

2010-02-18 Thread Bert Gunter
The key dates are 1938 and 1962. The FDC act of 1938 essentially mandated
(demonstration of) safety. The tox testing infrastructure grew from that. At
that time, there were no computers, little data, little statistics
methodology. Statistics played little role -- as is still mainly the case
today for safety. Any safety findings whatever in safety testing raise a
flag; statistical significance in the multiple testing framework is
irrelevant.

1962 saw the Kefauver-Harris Amendments that mandated demonstration of
efficacy. That was the key. The whole clinical trial framework and the
relevant statistical design and analysis infrastructure flowed from that
regulatory requirement. SAS's development soon after was therefore the first
direct response to the statistical software needs that resulted. Note also,
that statistical software was in its infancy at this time: before SAS there
was Fortran and COBOL; there was no statistical software.

So, as you can see, there essentially was **no** "before SAS". 

(Corrections/additional information welcome!)

Bert Gunter
Genentech Nonclinical Biostatistics
 
 

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Christopher W. Ryan
Sent: Thursday, February 18, 2010 10:09 AM
To: r-help@r-project.org
Cc: p.dalga...@biostat.ku.dk
Subject: Re: [R] Use of R in clinical trials

Pure Food and Drug Act: 1906
FDA: 1930s
founding of SAS: early 1970s

(from the history websites of SAS and FDA)

What did pharmaceutical companies use for data analysis before there was 
SAS? And was there much angst over the change to SAS from whatever was 
in use before?

Or was there not such emphasis on and need for thorough data analysis 
back then?

--Chris
Christopher W. Ryan, MD
SUNY Upstate Medical University Clinical Campus at Binghamton
425 Robinson Street, Binghamton, NY  13904
cryanatbinghamtondotedu

"If you want to build a ship, don't drum up the men to gather wood, 
divide the work and give orders. Instead, teach them to yearn for the 
vast and endless sea."  [Antoine de St. Exupery]

Bert Gunter wrote:
> DISCLAIMER: This represents my personal view and in no way reflects that
of
> my company.
> 
> Warning: This is a long harangue that contains no useful information on R.
> May be wise to delete without reading. 
> --
> 
> Sorry folks, I still don't understand your comments. As Cody's original
post
> pointed out, there are a host of factors other than ease of
programmability
> or even quality of results that weigh against any change. To reiterate,
all
> companies have a huge infrastructure of **validated SAS code** that would
> have to be replaced. This, in itself, would take years and cost tens of
> millions of dollars at least. Also to reiterate, it's not only
> statistical/reporting functionality but even more the integration into the
> existing clinical database systems that would have to be rewritten **and
> validated**. All this would have to be done while continuing full steam on
> existing submissions. It is therefore not surprising to me that no pharma
> company in its right mind even contemplates undertaking such an effort.
> 
> To put these things into perspective. Let's say Pfizer has 200 SAS
> programmers (it's probably more, as they are a large Pharma, but I dunno).
> If each programmer costs, conservatively, $200K U.S. per year fully
> loaded,
> that's $40 million U.S. for SAS Programmers. And this is probably a severe
> underestimate. So the $14M quoted below is chicken feed -- it doesn't even
> make the radar. 
> 
> To add further perspective, a single (large) pivotal clinical trial can
> easily cost $250M . A delay in approval due to fooling around trying to
> shift to a whole new software system could easily cause hundreds of
> million
> to billions if it means a competitor gets to the marketplace first. So, to
> repeat, SAS costs are chicken feed.
> 
> Yes, I suppose that the present system institutionalizes mediocrity. How
> could it be otherwise in any such large scale enterprise? Continuity,
> reliability, and robustness are all orders of magnitude more important for
> both the FDA and Pharma to get safe and efficacious drugs to the public.
> Constantly hopping onto the latest and greatest "craze" (yes, I exaggerate
> here!) would be dangerous, unacceptable, and would probably delay drug
> approvals. I consider this another example of the Kuhnsian paradigm
> (Thomas
> Kuhn: "The Structure of Scientific Revolutions")in action.
> 
> This is **not** to say that there is not a useful role for R (or STATA or
> ...) to play in clinical trial submissions or, more generally, in drug
> research and development. There certainly is. For the record, I use R
> exclusively in my (nonclinical statistics) work. Nor is to say that all
> change must be avoided. That would be downright dangerous. But let's
> please
> keep these issues in perspective. One's enthusiasm for R's manifold
> virtues
> should not replace common sense and logic. That, too, would be unfortunate.

[R] error in using sample( )

2010-02-18 Thread jxia
Hi,

I am using the command

>sample(c(0,1,2),1,prob=c(0.2,0.3,0.5))

and I have this error notification

"Error in sample(c(0,1,2),1,prob=c(0.2,0.3,0.5)):
 unused argument(s)(1,prob=c(0.2,0.3,0.5))

I don't know what is going wrong. Please give me some suggestions.

Thank you


Best,
Jing
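base::sample() does take size and prob arguments, so the posted call itself is well-formed. My best guess (an assumption -- I can't see the workspace) is that something else named `sample` is masking the base function. A sketch of the failure and the fix:

```r
# The posted call works against the real base function:
x <- base::sample(c(0, 1, 2), 1, prob = c(0.2, 0.3, 0.5))

# A user-defined 'sample' masks it and triggers "unused argument(s)":
sample <- function(x) x                            # hypothetical mask
# sample(c(0, 1, 2), 1, prob = c(0.2, 0.3, 0.5))  # would now fail

# Remove the mask (or keep calling base::sample explicitly):
rm(sample)
y <- sample(c(0, 1, 2), 1, prob = c(0.2, 0.3, 0.5))
```

find("sample") or conflicts() will show whether such a mask is present.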

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Plot spatial location by using azimuth and distance?

2010-02-18 Thread Jian Zhang
Hey, I have one data like this:

   tree azimuth distance
   1 312  200
   2 322  201
   3 304  173
   4 294  154
   5 313   95

The "azimuth" stands for the azimuth of tree from plot center, and the 
"distance" is the distance of tree from plot center.
I want to plot the spatial location of these trees. I want to ask if there is 
an easy way to plot it by using the azimuth and the distance, not by 
transforming the data to x-y values.

Thanks so much.

  Jian
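Two routes, sketched with the posted numbers. plotrix::polar.plot() takes lengths and angles directly (the start/clockwise settings below are my assumption for compass azimuths -- check ?polar.plot); failing that, the x-y conversion is only two lines:

```r
azimuth  <- c(312, 322, 304, 294, 313)   # degrees clockwise from north
distance <- c(200, 201, 173, 154, 95)

# With the plotrix package installed (argument settings assumed):
# plotrix::polar.plot(distance, azimuth, start = 90, clockwise = TRUE,
#                     rp.type = "s")     # "s" = symbols rather than lines

# Base graphics fallback:
x <- distance * sin(azimuth * pi / 180)  # east-west offset
y <- distance * cos(azimuth * pi / 180)  # north-south offset
plot(x, y, asp = 1, pch = 16, xlab = "E-W", ylab = "N-S")
points(0, 0, pch = 3)                    # plot centre
text(x, y, labels = seq_along(x), pos = 3)
```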

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Procedure not working for actual data

2010-02-18 Thread ROLL Josh F
Hey Jim,
  That appears to work properly with my larger data set.  That's really strange
to me though; why would my procedure not work even though the test works
correctly?  I have always coded under the assumption that the code doesn't do
anything the user doesn't tell it to, but I can't see a problem with my code.

I am looking at a similar problem with another piece of the code right now 
where everything looks right but it just isn't giving me the right output, 
although I haven't constructed a test yet.

Thanks for the help.

JR



From: jim holtman [mailto:jholt...@gmail.com]
Sent: Wednesday, February 17, 2010 5:09 PM
To: ROLL Josh F
Cc: r-help@r-project.org
Subject: Re: [R] Procedure not working for actual data

Try this on your real data:

> #Sample data
> Bldgid<-c(1000,1000,1001,1002,1003,1003)
> Maplot<-c(2,20001,3,30001,4,40001)
> Area<-c(40,170,50,100,100,4.9)
> #Construct Sample dataframe
> MultiLotBldgs..<-data.frame(Bldgid,Maplot,Area)
> #Get Building Area Proportions
> MultiLotBldgs..$Prop <- ave(MultiLotBldgs..$Area, MultiLotBldgs..$Bldgid,
+ FUN=function(x) x / sum(x))
>
> # find not too small
> notTooSmall <- !((MultiLotBldgs..$Area <= 45) | ((MultiLotBldgs..$Area > 45) &
+ (MultiLotBldgs..$Prop < 0.05)))
>
> MultiLotBldgs2.. <- MultiLotBldgs..[notTooSmall,]
> # print out results
> MultiLotBldgs2..
  Bldgid Maplot Area  Prop
2   1000  20001  170 0.8095238
3   1001  3   50 1.000
4   1002  30001  100 1.000
5   1003  4  100 0.9532888
>
>
>


On Wed, Feb 17, 2010 at 6:58 PM, ROLL Josh F 
mailto:jr...@lcog.org>> wrote:
Sorry Just a generic list

Is<-list()

forgot to add that from my actual code

From: jim holtman [mailto:jholt...@gmail.com]
Sent: Wednesday, February 17, 2010 3:58 PM
To: ROLL Josh F
Cc: r-help@r-project.org
Subject: Re: [R] Procedure not working for actual data

Your example does not work since "Is" is not defined.  What is it supposed to 
be?

On Wed, Feb 17, 2010 at 6:34 PM, LCOG1 mailto:jr...@lcog.org>> 
wrote:

Hello all,
  I have what I feel is a unique situation which may not be resolved with
this inquiry.  I have constructed the below data set so that I may give an
example of what I'm doing.  The example works perfectly and I have no issues
with it.  My problem arises with my actual data, which includes another 11
columns of data (used in later analysis) and a total of about 7000
cases (rows).  I mention the dimensions of the actual data because I'm
wondering if my below process would encounter problems with more data.
 To be sure, the problem occurs in the last step.  Is$NotTooSmall gives me a
binary output that is then put back in MultiLotBldgs.. (as shown in the
example) to return the cases I want to keep.
 In my actual data the binary designation is correct, but when
MultiLotBldgs2.. returns it doesn't remove the cases that are FALSE in
Is$NotTooSmall.  Like I said, my sample data works fine but my actual
implementation does not.  Any suggestions?  I know this is not easy to
answer without seeing the problem, but this is the best I can do without
sending you all of my data.

Cheers,
JR




#Sample data
Bldgid<-c(1000,1000,1001,1002,1003,1003)
Maplot<-c(2,20001,3,30001,4,40001)
Area<-c(40,170,50,100,100,4.9)
#Construct Sample dataframe
MultiLotBldgs..<-data.frame(Bldgid,Maplot,Area)
#Get Building Areas
MultiLotBldgArea.X <- unlist(tapply(MultiLotBldgs..$Area,
MultiLotBldgs..$Bldgid,
 function(x) x))

# Calculate the proportion of the total building area in each piece of the
building
MultiLotBldgProp.X <- unlist(tapply(MultiLotBldgs..$Area,
MultiLotBldgs..$Bldgid,
 function(x) x/sum(x)))

#Identify buildings that should be considered for joining
Is$NotTooSmall.X <- !(((MultiLotBldgArea.X <= 45) |
   ((MultiLotBldgArea.X > 45) & (MultiLotBldgProp.X
< 0.05))))

MultiLotBldgs2.. <- MultiLotBldgs..[Is$NotTooSmall.X, ]

--
View this message in context: 
http://n4.nabble.com/Procedure-not-working-for-actual-data-tp1559492p1559492.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Checking the assumptions for a proper GLM model

2010-02-18 Thread Jay
Well, yes and no. Obviously I was not asking for a complete recap of
all the theory on the subject. My main concern is finding readily
available CRAN functions and packages that would help me in the
process. I've found the UCLA site to be very informative and spent a
lot of time there over the last couple of days. However, their section on
using R for validating the assumptions is very lacking. Naturally
links like google.com and amazon.com will eventually get me there, but
if somebody has other recommendations, I would be very grateful for
even more help.

BR,
Jay
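In the meantime, the STATA linktest idea is easy to sketch by hand (my own approximation, not a CRAN function; mtcars is stand-in data): refit the model on the fitted linear predictor and its square, and look at the squared term.

```r
fit <- glm(am ~ hp + wt, family = binomial, data = mtcars)

# Regress the response on eta and eta^2; a clearly significant I(eta^2)
# term hints that the link (or specification) is off.
d   <- data.frame(am = mtcars$am, eta = predict(fit, type = "link"))
chk <- glm(am ~ eta + I(eta^2), family = binomial, data = d)
summary(chk)$coefficients
```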

On Feb 18, 7:01 pm, David Winsemius  wrote:
> At one time the "answer" would have been to buy a copy of Venables and  
> Ripley's "Modern Applied Statistics with S" (and R), and that would  
> still be a sensible strategy. There are now quite a few other R-
> centric texts that have been published in the last few years. Search  
> Amazon if needed. You seem to be asking for a tutorial on general  
> linear modeling (which if you read the Posting Guide you will find is  
> not a service offered by the r-help list.)  Perhaps you should have  
> edited the link you provided in the obvious fashion:
>
> http://www.ats.ucla.edu/stat/R/
>
> Perhaps one of these pages: http://www.ats.ucla.edu/stat/R/dae/default.htm
>
> The UCLA Statistics website used to be dismissive of R, but they more  
> recently appear to have seen the light. There is also a great amount  
> of contributed teaching material on CRAN:
>
> http://cran.r-project.org/other-docs.html
>
> ... and more would be readily available via Googling with "r-project"  
> as part of a search strategy. Frank Harrell's material is in  
> particular quite useful:
>
> http://biostat.mc.vanderbilt.edu/wiki/Main/StatComp
>
> --
> David.
>
> On Feb 18, 2010, at 8:32 AM, Jay wrote:
>
>
>
>
>
> > So what I'm looking for is readily available tools/packages that could
> > produce some of the following:
>
> > 3.6 Summary of Useful Commands (STATA: Source:
> >http://www.ats.ucla.edu/stat/Stata/webbooks/logistic/chapter3/statalo...)
>
> >    * linktest--performs a link test for model specification, in our
> > case to check if logit is the right link function to use. This command
> > is issued after the logit or logistic command.
> 
> > and performs nonlinearity test.
>
> > But, since I'm new to GLM, I owuld greatly appreciate how you/others
> > go about and test the validity of a GLM model.
>
> > On Feb 18, 1:18 am, Jay  wrote:
> >> Hello,
>
> >> Are there any packages/functions available for testing the  
> >> assumptions
> >> underlying assumptions for a good GLM model? Like linktest in STATA
> >> and smilar. If not, could somebody please describe their work process
> >> when they check the validity of a logit/probit model?
>
> >> Regards,
> >> Jay
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
> __
> r-h...@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] skip lapply item for conditions(RAM, execution time)

2010-02-18 Thread Joachim Harloff
Thank you,

but: 
- How do I create a condition object watching for memory overflow? Or, 
alternatively, excess time?
- How do I tell R to end the current lapply item, clean up memory and proceed 
to the next item? (I know I can write a wrapper function including gc() and 
direct lapply to that wrapper function. But what do I do to stop ONLY the 
original function, not the whole R process?)


Joachim
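Following up on the ?tryCatch pointer, here is a base-R sketch (dofunction is a stand-in for the package call, defined below only so the example runs): setTimeLimit() turns an exceeded time budget into an ordinary R error, tryCatch() turns any error into a skip marker, and gc() runs before the next item. Hard out-of-memory kills cannot always be caught, but R-level allocation errors can.

```r
dofunction <- function(x) {       # stand-in for the flaky package function
  if (x %% 3 == 0) stop("simulated failure")
  x^2
}

run_one <- function(x, secs = 60) {
  tryCatch({
    setTimeLimit(elapsed = secs, transient = TRUE)    # error out after 'secs'
    on.exit(setTimeLimit(elapsed = Inf), add = TRUE)  # reset the limit
    dofunction(x)
  },
  error = function(e) {
    gc()      # reclaim memory before moving on
    NULL      # marker for a skipped item
  })
}

results <- lapply(1:10, run_one)
failed  <- vapply(results, is.null, logical(1))       # which items were skipped
```

Only the failing call is abandoned; the lapply loop and the R session carry on.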

> ?tryCatch   
> 
> Bert Gunter
> Genentech Nonclinical Biostatistics
>  
>  
> 
> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] 
> On
> Behalf Of Joachim Harloff
> Sent: Wednesday, February 17, 2010 7:00 AM
> To: r-help@r-project.org
> Subject: [R] skip lapply item for conditions(RAM, execution time)
> 
> Hi,
> 
> I use lapply with a function from a package, and every 20th to 50th 
> function
> execution (out of 500 to 10 000 times) fails for some unknown reason. 
> RAM
> consumption ever increases (out of limit) and execution time is endless. 
> I
> am not going to debug the package. (I rather feel like an end-user.)
> 
> If it was Java, I'd wrap the function in a try...catch statement like 
> about
> 
> for (int i = 1; i <= n; i++) {     // n = the number of function calls
>     try {
>         dofunction(i);
>     } catch (OutOfMemoryError e) {
>         System.gc();
>         continue;                  // skip to the next item
>     }
> }
> 
> 
> I did not find out how to construct this in R. Is anyone so kind to 
> point me
> to a documentation more exhaustive than help(conditions)? Or otherwise
> provide a snippet of R code?
> 
> Regards, Joachim
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Changed results in analyses run in sem and nlme ??

2010-02-18 Thread Claire Lay
 I'm uncertain how helpful it will be to give example code, but last week,
this model gave an error message to the tune of "failed to converge" after
about 5 minutes of run-time :

library(nlme)
model.A<- lme (fixed = avbranch~ wk*trt*pop
, random = ~wk|ID/fam/pop, data=branch)

 It seemed that failure to converge made sense, since there were many weeks
(wk) and values for "ID/fam/pop".  I settled for this model:

model.A2 <- lme (fixed = avbranch~ wk*trt*pop
, random = ~1|ID/fam/pop, data=branch)

However, when I tried the model.A on a different dependent variable this
week, it converged.  Since the challenge to convergence (many levels of wk
and ID/fam/pop) was the same as it had been before, I went back and tried
model.A on the other analysis, and it also ran.  I then started checking
results for everything I'd done in the past three weeks in packages that use
ML methods (FIML, REML)--and got different outcomes.  I've quadruple-checked
to be sure I'm using the same code and the same data (I use .csv files for
simplicity), and see no differences.  However, results from nlme and sem
packages are both different.  I had not saved detailed output, but had
recorded parameters, model-fit statistics, and convergence failures.

Could some new package I installed have changed the way that MLE
methods are functioning in the work environment?  Everything in the search
path looks as it did before, but could something like Rcmdr (just installed,
but not now in the search path) change other parts of the environment?

Is that possible?  If so, how do I check?  If not, could anything (besides
user error--obviously the simplest solutions are that the code or the data
really is different or I recorded the results incorrectly before) account
for different results using the same data and code?

My main concern is about which results are correct: convergence seems to be
happening much faster now, and some models are showing better fits, but the
difference makes me nervous.

Any ideas would be helpful.

Thanks,

Claire Lay
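No answer to offer, but a few base-R checks help rule out environment changes before doubting the results themselves (the commented line refers to model.A from the post):

```r
si <- sessionInfo()        # record R and package versions alongside results
print(si)
search()                   # what is attached, and in what order
conflicts(detail = TRUE)   # functions masked across the search path
# Archiving fitted objects lets old runs be re-verified later:
# save(model.A, file = "modelA-2010-02-18.RData")
```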

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] 3D plot

2010-02-18 Thread David A.G

Dear list,

can anyone point me to a function or library that can create a graph similar to 
the one in the following powerpoint presentation?

http://bmi.osu.edu/~khuang/IBGP705/BMI705-Lecture7.ppt

(pages 36-37)

In order to try to explain the graph, the way I see it in R terms is something 
like this:

the "p-q" axis is a vector of positions (for example, seq(0,500,1))
the "Chr1-Chrx" is a vector of units, in this case chromosomes (so something 
like seq(1,10,1))
the plotted data is observations for each unit at each position

I guess the fancy gradient on the highest peaks is tougher to get (knowing I am 
not an R expert), but just plain blue would suffice.

I have checked some of the graphs in the R graph gallery but I don't think any
of them would work

Thanks in advance,

Dave
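Without the slides' data, here is a base-graphics sketch with persp() and made-up observations (lattice::wireframe or the rgl package are alternatives if you want the fancier gradient); rows index chromosomes, columns index positions:

```r
pos <- seq(0, 500, by = 10)   # the "p-q" position axis
chr <- 1:10                   # the "Chr1 ... Chr10" axis
# Fabricated observations: one peak per chromosome, purely for illustration.
z <- outer(chr, pos, function(chrom, p) exp(-((p - 50 * chrom) / 40)^2))
persp(chr, pos, z, theta = 35, phi = 25, col = "lightblue",
      xlab = "Chromosome", ylab = "Position", zlab = "Observations",
      border = NA, shade = 0.5)
```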
  

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] survey package question

2010-02-18 Thread Richard Valliant
Should the svyby function be able to work with svyquantile?  I get the
error below ...

data(api)
dclus1<-svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)
 
svyby(~api00, 
  design=dclus1,
  by = ~stype, 
  quantiles=c(.25,.5,.75), 
  FUN=svyquantile, 
  na.rm=T 
  )
 
> Error in object$coefficients : $ operator is invalid for atomic
vectors

A more general question is: can quantiles and their SEs be computed for
subgroups?
 
The same syntax works using svymean.

Thanks
R. Valliant
Univ. of Maryland, US
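One workaround worth trying (my reading of the svyby arguments -- verify against your installed survey version) is keep.var = FALSE, since svyquantile does not return variances in the form svyby expects by default; SEs for subgroup quantiles may need svyquantile's own ci/interval options where available.

```r
# Requires the survey package; guarded so the sketch is self-contained.
if (requireNamespace("survey", quietly = TRUE)) {
  library(survey)
  data(api)
  dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

  # Subgroup quantiles, without standard errors:
  res <- svyby(~api00, ~stype, design = dclus1, FUN = svyquantile,
               quantiles = c(0.25, 0.5, 0.75), keep.var = FALSE, na.rm = TRUE)
  print(res)
}
```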

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Funny result from rep(...) procedure

2010-02-18 Thread Eik Vettorazzi

Sorry, not reproducible. This works for me (as expected):

branches<-c(5,6)
iInd = 1
 for(i in 1:length(branches)) {
   print((1:branches[i])+iInd-1)   # iInd is a position shift of the index
   ni = branches[i]
   print(i)
   print(ni)
   print(c(ni,rep(i,times=ni)))
# ... some interesting other stuff for my project that gets wrecked 
because of this issue

   iInd = iInd + branches[i]
}


dkStevens schrieb:

I'm observing odd behavior of the rep(...) procedure when using variables as
parameters in a loop. Here's a simple loop on a vector 'branches' that is
c(5,6,5,5,5). The statement in question is   
print(c(ni,rep(i,times=ni)))


that works properly first time through the loop but the second time, when
branches[2] = 6, only prints 5 values of i.

Any ideas, anyone?

  iInd = 1
  for(i in 1:length(branches)) {
print((1:branches[i])+iInd-1)   # iInd is a position shift of the index
ni = branches[i]
print(i)
print(ni)
print(c(ni,rep(i,times=ni)))
# ... some interesting other stuff for my project that gets wrecked because
of this issue
iInd = iInd + branches[i]
}

# first pass through loop
[1] 1 2 3 4 5 # branches[1] + iInd - 1 content
[1] 1# i value to repeat 5 times
[1] 5# ni = 5 on 1st pass
[1] 5 1 1 1 1 1   # five values - 1 1 1 1 1 - OK

# second pass through loop
[1]  6  7  8  9 10 11   # branches[2] + iInd - 1 content
[1] 2   # i value to repeat 6 times
[1] 6   # ni = 6 on 2nd pass
[1] 6 2 2 2 2 2  # 6 'twos' but only shows 5 'twos' 
print(c(ni,rep(i,times=ni))) - why not 6?



  


--
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf

Martinistr. 52
20246 Hamburg

T ++49/40/7410-58243
F ++49/40/7410-57790

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Procedure not working for actual data

2010-02-18 Thread jim holtman
Even though it may work for a small subset, it can still break on
larger sets.  Your code was doing a number of 'unlist' and tearing
apart the data and it is possible that some of the transformations
were not aligned with the data in the way you thought them to be.
What you need to do in that case is break down what is happening and
look at the data in each substep to make sure it is what you are
expecting.
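To make the substep point concrete, here is my reconstruction of the likely failure mode (an assumption -- the real data isn't available): unlist(tapply(...)) returns values grouped by sorted factor level, not by row order, so it only lines up when Bldgid happens to be sorted, as it is in the 6-row test.

```r
df <- data.frame(id = c(2, 1, 2, 1), area = c(10, 20, 30, 40))

# Grouped as id 1 first, then id 2 -- NOT the row order of df:
unlist(tapply(df$area, df$id, function(x) x / sum(x)))

# ave() returns one value per row, aligned with the original rows:
df$prop <- ave(df$area, df$id, FUN = function(x) x / sum(x))
df
```

With ~7000 unsorted rows, the tapply version silently scrambles the proportions, which would explain the wrong rows being kept or dropped even though each value looks plausible.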

On Thu, Feb 18, 2010 at 11:50 AM, ROLL Josh F  wrote:
> Hey Jim,
>   That appears to work properly with my larger data set.  That's really
> strange to me though; why would my procedure not work even though the test
> works correctly?  I have always coded under the assumption that the code
> doesn't do anything the user doesn't tell it to, but I can't see a problem
> with my code.
>
> I am looking at a similar problem with another piece of the code right now
> where everything looks right but it just isn't giving me the right output,
> although I haven't constructed a test yet.
>
> Thanks for the help.
>
> JR
>
> 
> From: jim holtman [mailto:jholt...@gmail.com]
> Sent: Wednesday, February 17, 2010 5:09 PM
> To: ROLL Josh F
> Cc: r-help@r-project.org
> Subject: Re: [R] Procedure not working for actual data
>
> Try this on your real data:
>
>> #Sample data
>> Bldgid<-c(1000,1000,1001,1002,1003,1003)
>> Maplot<-c(2,20001,3,30001,4,40001)
>> Area<-c(40,170,50,100,100,4.9)
>> #Construct Sample dataframe
>> MultiLotBldgs..<-data.frame(Bldgid,Maplot,Area)
>> #Get Building Area Proportions
>> MultiLotBldgs..$Prop <- ave(MultiLotBldgs..$Area, MultiLotBldgs..$Bldgid,
> + FUN=function(x) x / sum(x))
>>
>> # find not too small
>> notTooSmall <- !((MultiLotBldgs..$Area <= 45) | ((MultiLotBldgs..$Area >
>> 45) &
> + (MultiLotBldgs..$Prop < 0.05)))
>>
>> MultiLotBldgs2.. <- MultiLotBldgs..[notTooSmall,]
>> # print out results
>> MultiLotBldgs2..
>   Bldgid Maplot Area  Prop
> 2   1000  20001  170 0.8095238
> 3   1001  3   50 1.000
> 4   1002  30001  100 1.000
> 5   1003  4  100 0.9532888
>>
>>
>>
>
>
> On Wed, Feb 17, 2010 at 6:58 PM, ROLL Josh F  wrote:
>>
>> Sorry Just a generic list
>>
>> Is<-list()
>>
>> forgot to add that from my actual code
>> 
>> From: jim holtman [mailto:jholt...@gmail.com]
>> Sent: Wednesday, February 17, 2010 3:58 PM
>> To: ROLL Josh F
>> Cc: r-help@r-project.org
>> Subject: Re: [R] Procedure not working for actual data
>>
>> Your example does not work since "Is" is not defined.  What is it supposed
>> to be?
>>
>> On Wed, Feb 17, 2010 at 6:34 PM, LCOG1  wrote:
>>>
>>> Hello all,
>>>   I have what I feel is a unique situation which may not be resolved with
>>> this inquiry.  I have constructed the below data set so that I may give
>>> an example of what I'm doing.  The example works perfectly and I have no
>>> issues with it.  My problem arises with my actual data, which includes
>>> another 11 columns of data (used in later analysis) and a total of about
>>> 7000 cases (rows).  I mention the dimensions of the actual data because
>>> I'm wondering if my below process would encounter problems with more data.
>>>  To be sure, the problem occurs in the last step.  Is$NotTooSmall gives
>>> me a binary output that is then put back in MultiLotBldgs.. (as shown in
>>> the example) to return the cases I want to keep.
>>>  In my actual data the binary designation is correct, but when
>>> MultiLotBldgs2.. returns it doesn't remove the cases that are FALSE in
>>> Is$NotTooSmall.  Like I said, my sample data works fine but my actual
>>> implementation does not.  Any suggestions?  I know this is not easy to
>>> answer without seeing the problem, but this is the best I can do without
>>> sending you all of my data.
>>>
>>> Cheers,
>>> JR
>>>
>>>
>>>
>>>
>>> #Sample data
>>> Bldgid<-c(1000,1000,1001,1002,1003,1003)
>>> Maplot<-c(2,20001,3,30001,4,40001)
>>> Area<-c(40,170,50,100,100,4.9)
>>> #Construct Sample dataframe
>>> MultiLotBldgs..<-data.frame(Bldgid,Maplot,Area)
>>> #Get Building Areas
>>> MultiLotBldgArea.X <- unlist(tapply(MultiLotBldgs..$Area,
>>> MultiLotBldgs..$Bldgid,
>>>                              function(x) x))
>>>
>>> # Calculate the proportion of the total building area in each piece of
>>> the
>>> building
>>> MultiLotBldgProp.X <- unlist(tapply(MultiLotBldgs..$Area,
>>> MultiLotBldgs..$Bldgid,
>>>                              function(x) x/sum(x)))
>>>
>>> #Identify buildings that should be considered for joining
>>> Is$NotTooSmall.X <- !(((MultiLotBldgArea.X <= 45) |
>>>                            ((MultiLotBldgArea.X > 45) &
>>> (MultiLotBldgProp.X
>>> < 0.05))))
>>>
>>> MultiLotBldgs2.. <- MultiLotBldgs..[Is$NotTooSmall.X, ]
>>>
>>> --
>>> View this message in context:
>>> http://n4.nabble.com/Procedure-not-working-for-actual-data-tp1559492p1559492.html
>>> Sent from the R help mailing list archive at Nabble.com.
>>>
>>> __

Re: [R] Extract p-value from aftreg object

2010-02-18 Thread David Winsemius


On Feb 18, 2010, at 12:33 PM, David Winsemius wrote:

I don't seem to have the unnamed package loaded that has aftreg(),  
but in general you ought to be able to get what you want by looking  
not just at the aftreg-object but also at the print(aftreg)-object  
using str().


Trying that approach after loading package eha shows it will not work,  
and looking at the print method shows that no object is returned.
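Since print() computes rather than stores them, the Wald p column can be recomputed by hand. The sketch below first checks the formula against the posted output, then (an assumption -- confirm with names(fit1) that the fit stores coefficients and a covariance matrix var, as other eha fitters do) applies it to the stored pieces:

```r
# For var1 the post shows Coef 0.150, se(Coef) 0.770, Wald p 0.845:
co <- 0.150
se <- 0.770
2 * pnorm(-abs(co / se))   # ~0.845, matching the printed "Wald p"

# Assumed generalisation to the whole coefficient vector:
# p <- 2 * pnorm(-abs(fit1$coefficients / sqrt(diag(fit1$var))))
```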


--
David.



On Feb 18, 2010, at 11:07 AM, Philipp Rappold wrote:


Dear all,

does anyone know how I can extract specific p-values for covariates  
from an aftreg object? After fitting a model with aftreg I can find  
all different variables by using str(), but there's no place where  
p-values are kept. The odd thing is that print() displays them  
correctly.


EXAMPLE:

> testdata
  start stop censor groupvar       var1      var2
1     0    1      0        1 0.91663902 0.0847912
2     1    2      0        1 0.60470753 0.6487798
3     2    3      0        1 0.09599891 0.2195178
4     3    4      1        1 0.86384189 0.6667897
5     0    1      0        2 0.07747445 0.8782836
6     1    2      0        2 0.44608030 0.2218685
7     2    3      1        2 0.77317152 0.3813840


> fit1 <- aftreg(Surv(start, stop, censor)~var1, data=testdata)


> fit1
Call:
aftreg(formula = Surv(start, stop, censor) ~ var1, data = testdata)

Covariate          W.mean      Coef Exp(Coef)  se(Coef)    Wald p
var1                0.540     0.150     1.162     0.770     0.845

log(scale)                    1.358     3.890     0.260     0.000
log(shape)                    2.015     7.502     0.781     0.010

Events                    2
Total time at risk        7
Max. log. likelihood      -1.4026
LR test statistic         0.05
Degrees of freedom        1
Overall p-value           0.816436
WALD P IS DISPLAYED CORRECTLY.


Any help is highly appreciated, I'm going nuts here ;)

Thanks
Philipp

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Funny result from rep(...) procedure

2010-02-18 Thread Erik Iverson



Erik Iverson wrote:
Cannot reproduce, what is branches?  If you can narrow it down to a 
"commented, minimal, self-contained, reproducible" example, you're far 
more likely to get help from the list.




My blind guess, though, is that it has something to do with FAQ 7.31.
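The FAQ 7.31 guess can be made concrete: if 'branches' was computed rather than typed in, floating point plus rep()'s truncation of a non-integer 'times' reproduces the one-short symptom exactly.

```r
ni <- 0.3 / 0.1                    # prints as 3, but is 2.9999999999999996
ni == 3                            # FALSE -- FAQ 7.31
length(rep(2, times = ni))         # 2: rep() truncates 'times' toward zero
length(rep(2, times = round(ni)))  # 3, as intended
```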

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Funny result from rep(...) procedure

2010-02-18 Thread Erik Iverson
Cannot reproduce, what is branches?  If you can narrow it down to a 
"commented, minimal, self-contained, reproducible" example, you're far 
more likely to get help from the list.


dkStevens wrote:

I'm observing odd behavior of the rep(...) procedure when using variables as
parameters in a loop. Here's a simple loop on a vector 'branches' that is
c(5,6,5,5,5). The statement in question is   
print(c(ni,rep(i,times=ni)))


that works properly first time through the loop but the second time, when
branches[2] = 6, only prints 5 values of i.

Any ideas, anyone?

  iInd = 1
  for(i in 1:length(branches)) {
print((1:branches[i])+iInd-1)   # iInd is a position shift of the index
ni = branches[i]
print(i)
print(ni)
print(c(ni,rep(i,times=ni)))
# ... some interesting other stuff for my project that gets wrecked because
of this issue
iInd = iInd + branches[i]
}

# first pass through loop
[1] 1 2 3 4 5 # branches[1] + iInd - 1 content
[1] 1# i value to repeat 5 times
[1] 5# ni = 5 on 1st pass
[1] 5 1 1 1 1 1   # five values - 1 1 1 1 1 - OK

# second pass through loop
[1]  6  7  8  9 10 11   # branches[2] + iInd - 1 content
[1] 2   # i value to repeat 6 times
[1] 6   # ni = 6 on 2nd pass
[1] 6 2 2 2 2 2  # 6 'twos' but only shows 5 'twos' 
print(c(ni,rep(i,times=ni))) - why not 6?





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Use of R in clinical trials

2010-02-18 Thread Christopher W. Ryan

Pure Food and Drug Act: 1906
FDA: 1930s
founding of SAS: early 1970s

(from the history websites of SAS and FDA)

What did pharmaceutical companies use for data analysis before there was 
SAS? And was there much angst over the change to SAS from whatever was 
in use before?


Or was there not such emphasis on and need for thorough data analysis 
back then?


--Chris
Christopher W. Ryan, MD
SUNY Upstate Medical University Clinical Campus at Binghamton
425 Robinson Street, Binghamton, NY  13904
cryanatbinghamtondotedu

"If you want to build a ship, don't drum up the men to gather wood, 
divide the work and give orders. Instead, teach them to yearn for the 
vast and endless sea."  [Antoine de St. Exupery]


Bert Gunter wrote:

DISCLAIMER: This represents my personal view and in no way reflects that of
my company.

Warning: This is a long harangue that contains no useful information on R.
May be wise to delete without reading. 
--


Sorry folks, I still don't understand your comments. As Cody's original post
pointed out, there are a host of factors other than ease of programmability
or even quality of results that weigh against any change. To reiterate, all
companies have a huge infrastructure of **validated SAS code** that would
have to be replaced. This, in itself, would take years and cost tens of
millions of dollars at least. Also to reiterate, it's not only
statistical/reporting functionality but even more the integration into the
existing clinical database systems that would have to be rewritten **and
validated**. All this would have to be done while continuing full steam on
existing submissions. It is therefore not surprising to me that no pharma
company in its right mind even contemplates undertaking such an effort.

To put these things into perspective. Let's say Pfizer has 200 SAS
programmers (it's probably more, as they are a large Pharma, but I dunno).
If each programmer costs, conservatively, $200K U.S. per year fully loaded,
that's $40 million U.S. for SAS Programmers. And this is probably a severe
underestimate. So the $14M quoted below is chicken feed -- it doesn't even
make the radar. 


To add further perspective, a single (large) pivotal clinical trial can
easily cost $250M . A delay in approval due to fooling around trying to
shift to a whole new software system could easily cause hundreds of million
to billions if it means a competitor gets to the marketplace first. So, to
repeat, SAS costs are chicken feed.

Yes, I suppose that the present system institutionalizes mediocrity. How
could it be otherwise in any such large scale enterprise? Continuity,
reliability, and robustness are all orders of magnitude more important for
both the FDA and Pharma to get safe and efficacious drugs to the public.
Constantly hopping onto the latest and greatest "craze" (yes, I exaggerate
here!) would be dangerous, unacceptable, and would probably delay drug
approvals. I consider this another example of the Kuhnian paradigm (Thomas
Kuhn: "The Structure of Scientific Revolutions") in action.

This is **not** to say that there is not a useful role for R (or STATA or
...) to play in clinical trial submissions or, more generally, in drug
research and development. There certainly is. For the record, I use R
exclusively in my (nonclinical statistics) work. Nor is it to say that all
change must be avoided. That would be downright dangerous. But let's please
keep these issues in perspective. One's enthusiasm for R's manifold virtues
should not replace common sense and logic. That, too, would be unfortunate.

Since I've freely blustered, I am now a fair target. So I welcome forceful
rebuttals and criticisms and, as I've said what I wanted to, I will not
respond. You have the last word.

Bert Gunter
Genentech Nonclinical Biostatistics


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] an error about " return some vectors from some functions within a function"

2010-02-18 Thread jim holtman
Might be the case that the 'while' loop was not executed and therefore
'ck1' was not defined.  You might put in a check to see if that is
happening.  Also, you are incrementing 'count1' in the functions and it
is not being passed in as a parameter.  What are you expecting it to
do?  Is it defined in the global workspace?  The incrementing of
'count1' is local to the function.  Are you expecting it to count the
number of times things happen?  If so, you need to do:

count1 <<- count1 + 1

but be advised that "<<-" is not for novices; make sure you understand the
scoping rules of R before using it.
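To illustrate the scoping point, here is a minimal sketch (added for illustration; the counter name mirrors the thread's `count1`):

```r
count1 <- 0                # counter in the global workspace

bump_local <- function() {
  count1 <- count1 + 1     # creates a *local* count1; the global one is untouched
}

bump_global <- function() {
  count1 <<- count1 + 1    # searches enclosing environments and updates the global count1
}

bump_local()
count1                     # still 0
bump_global()
count1                     # now 1
```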

On Thu, Feb 18, 2010 at 12:05 PM, Nai-Wei Chen  wrote:
> Dear all,
>
> When I try to return some vectors from some functions within a function, it
> indicates an error, "Error in rbind(ck1, ck2, ck3) : object 'ck1' not found",
> in one of the iterations and stops.  Since I am not experienced in
> programming, can anyone give me a suggestion for inspecting this error?
> The following are the functions I created:
>
> ###
> # functions in the convg #
> ###
> check1 <- function(sumgt,beta1.0,gamma.0,sigma.0){
>    if (any(!is.finite(sumgt))){
>    count1 <- count1+1
>    return(c(count1,beta1.0,gamma.0,sigma.0))
>    }
>   else {return(c(NaN,NaN,NaN,NaN))}
>  }
> check2 <- function(v0,maxit,iter,beta1.0,gamma.0,sigma.0){
>     if (is.nan(sum(v0))==TRUE | any(!is.finite(v0)) | maxit == iter){
>    count1 <- count1+1
>    return(c(count1,beta1.0,gamma.0,sigma.0))
>    }
>    else {return(c(NaN,NaN,NaN,NaN))}
>  }
> check3 <- function(maxit,diff,error,beta1.0,gamma.0,sigma.0){
>     if (diff < error) {
>    return(c(count,beta1.0,gamma.0,sigma.0))
>    }
>    else {return(c(NaN,NaN,NaN,NaN))}
>  }
>
> convg <- 
> function(count1,count,sub,rep,n,data1,beta1.0,gamma.0,sigma.0,v0,L,diff,error,iter,maxit){
>
>  while(diff > error && maxit < iter && max(abs(c(beta1.0,gamma.0,sigma.0))) < 
> 10 && is.nan(sum(v0))==FALSE && any(is.finite(v0))){
>    :
>    :
>  ck1 <- check1(sumgt,beta1.0,gamma.0,sigma.0)
>    :
>    :
>  ck2 <- check2(v0,maxit,iter,beta1.0,gamma.0,sigma.0)
>    :
>    :
>  ck3 <- check3(maxit,diff,error,beta1.0,gamma.0,sigma.0)
>
> }
> return(rbind(ck1,ck2,ck3))
> }
>
> Thank you so much
>
>
> Sincerely,
>
>
> Joe
>
>
>
>
>
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



[R] Hmisc summary.formula.reverse

2010-02-18 Thread Abhijit Dasgupta

Hello,

Can summary.formula.reverse be customized to allow other summary 
statistics to be reported rather than the quartiles and mean +/- sd? The 
"fun" option apparently doesn't apply when method='reverse'


Thanks
--

Abhijit Dasgupta, PhD

Statistician | Clinical Sciences Section | NIAMS/NIH




[R] Funny result from rep(...) procedure

2010-02-18 Thread dkStevens

I'm observing odd behavior of the rep(...) function when using variables as
parameters in a loop. Here's a simple loop over a vector 'branches' that is
c(5,6,5,5,5). The statement in question is

print(c(ni,rep(i,times=ni)))

which works properly the first time through the loop but the second time,
when branches[2] = 6, prints only 5 values of i.

Any ideas, anyone?

  iInd = 1
  for(i in 1:length(branches)) {
print((1:branches[i])+iInd-1)   # iInd is a position shift of the index
ni = branches[i]
print(i)
print(ni)
print(c(ni,rep(i,times=ni)))
# ... some interesting other stuff for my project that gets wrecked
# because of this issue
iInd = iInd + branches[i]
}

# first pass through loop
[1] 1 2 3 4 5 # branches[1] + iInd - 1 content
[1] 1# i value to repeat 5 times
[1] 5# ni = 5 on 1st pass
[1] 5 1 1 1 1 1   # five values - 1 1 1 1 1 - OK

# second pass through loop
[1]  6  7  8  9 10 11   # branches[2] + iInd - 1 content
[1] 2   # i value to repeat 6 times
[1] 6   # ni = 6 on 2nd pass
[1] 6 2 2 2 2 2   # expected six 'twos' from print(c(ni,rep(i,times=ni))),
                  # but only five are shown - why not 6?
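For what it's worth, a stripped-down version of the loop (a check added here, not output from the thread) does print six values on the second pass, which suggests the short line above was a copy-and-paste slip rather than a problem in rep():

```r
branches <- c(5, 6, 5, 5, 5)
for (i in 1:2) {
  ni <- branches[i]
  print(c(ni, rep(i, times = ni)))
}
# [1] 5 1 1 1 1 1
# [1] 6 2 2 2 2 2 2
```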


-- 
View this message in context: 
http://n4.nabble.com/Funny-result-from-rep-procedure-tp1560547p1560547.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] subset() for multiple values

2010-02-18 Thread Erik Iverson

subset(df, x %in% c(...))

chipmaney wrote:

This code works:

subset(NativeDominant.df,!ID=="37-R17")


This code does not:

Tree.df<-subset(NativeDominant.df,!ID==c("37-R17","37-R18","10-R1","37-R21","37-R24","R7A-R1","3-R1","37-R16"))



how do i get subset() to work on a range of values?




Re: [R] subset() for multiple values

2010-02-18 Thread Henrique Dallazuanna
Use ' %in%':

Tree.df<-subset(NativeDominant.df,!ID %in%
c("37-R17","37-R18","10-R1","37-R21","37-R24","R7A-R1","3-R1","37-R16"))
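The reason `==` fails here is vectorized recycling: `ID == c(...)` compares the ID column element-wise against the recycled vector of values instead of testing membership. A small sketch with made-up values:

```r
ids <- c("a", "b", "c", "d")
ids == c("a", "c")    # recycled comparison: a=="a", b=="c", c=="a", d=="c"
# [1]  TRUE FALSE FALSE FALSE
ids %in% c("a", "c")  # true membership test
# [1]  TRUE FALSE  TRUE FALSE
```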

On Thu, Feb 18, 2010 at 3:59 PM, chipmaney  wrote:
>
> This code works:
>
> subset(NativeDominant.df,!ID=="37-R17")
>
>
> This code does not:
>
> Tree.df<-subset(NativeDominant.df,!ID==c("37-R17","37-R18","10-R1","37-R21","37-R24","R7A-R1","3-R1","37-R16"))
>
>
>
> how do i get subset() to work on a range of values?
> --
> View this message in context: 
> http://n4.nabble.com/subset-for-multiple-values-tp1560543p1560543.html
> Sent from the R help mailing list archive at Nabble.com.
>
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O



[R] subset() for multiple values

2010-02-18 Thread chipmaney

This code works:

subset(NativeDominant.df,!ID=="37-R17")


This code does not:

Tree.df<-subset(NativeDominant.df,!ID==c("37-R17","37-R18","10-R1","37-R21","37-R24","R7A-R1","3-R1","37-R16"))



how do i get subset() to work on a range of values?
-- 
View this message in context: 
http://n4.nabble.com/subset-for-multiple-values-tp1560543p1560543.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Extract p-value from aftreg object

2010-02-18 Thread Göran Broström
2010/2/18 Philipp Rappold :
> Dear all,
>
> does anyone know how I can extract specific p-values for covariates from an
> aftreg object? After fitting a model with aftreg I can find all different
> variables by using str(), but there's no place where p-values are kept. The
> odd thing is that print() displays them correctly.
>
> EXAMPLE:
>
>> testdata
>  start stop censor groupvar       var1      var2
> 1     0    1      0        1 0.91663902 0.0847912
> 2     1    2      0        1 0.60470753 0.6487798
> 3     2    3      0        1 0.09599891 0.2195178
> 4     3    4      1        1 0.86384189 0.6667897
> 5     0    1      0        2 0.07747445 0.8782836
> 6     1    2      0        2 0.44608030 0.2218685
> 7     2    3      1        2 0.77317152 0.3813840
>
>
>> fit1 <- aftreg(Surv(start, stop, censor)~var1, data=testdata)
>
>
>> fit1
> Call:
> aftreg(formula = Surv(start, stop, censor) ~ var1, data = testdata)
>
> Covariate          W.mean      Coef Exp(Coef)  se(Coef)    Wald p
> var1                0.540     0.150     1.162     0.770     0.845
>
> log(scale)                    1.358     3.890     0.260     0.000
> log(shape)                    2.015     7.502     0.781     0.010
>
> Events                    2
> Total time at risk             7
> Max. log. likelihood      -1.4026
> LR test statistic         0.05
> Degrees of freedom        1
> Overall p-value           0.816436
>        WALD P IS DISPLAYED CORRECTLY.
>
>
> Any help is highly appreciated, I'm going nuts here ;)

print.aftreg calculates the p-values like this:

pchisq(x$coef^2 / diag(x$var), 1, lower.tail = FALSE)

where  x  is the output from 'aftreg'. You could try the same.
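Putting that together, a sketch of extracting the Wald p-values (assuming `fit1` is the aftreg fit from the example; the `coef` and `var` components are those used in the formula above):

```r
wald.p <- pchisq(fit1$coef^2 / diag(fit1$var), df = 1, lower.tail = FALSE)
wald.p["var1"]   # Wald p-value for the var1 covariate
```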


Thanks
> Philipp
>
>



-- 
Göran Broström



Re: [R] AFTREG with ID argument

2010-02-18 Thread Göran Broström
2010/2/18 Philipp Rappold :
> Göran, David,
>
> in order to adapt aftreg to my needs I wrote a little function that I would
> like to share with you and the community.

I once promised to fix this 'asap'. Now I promise to do it tonight. OK?

Göran

>
>
> WHAT DOES IT FIX?
>
> (1) Using the id-argument in combination with missing values on covariates
> wasn't easily possible before because the id-dataframe and the
> data-dataframe had different sizes and aftreg quit with an error. My fix
> makes sure that NAs are excluded from both dataframes and aftreg will run
> without error here.
>
> (2) The id-argument was required to be specified by its "absolute path" (eg.
> id=testdata$groupvar, see below in this thread). My adapted function takes
> the name of the id-variable as a string, eg. id="groupvar".
>
>
> HOW DOES IT WORK?
>
> Use function aftreg2 just like you would use aftreg. Mandatory arguments
> are: formula, data and id, where id is a string variable. Example:
>
>> testdata
>  start stop censor groupvar      var1
> 1     0    1      0        1 0.1284928
> 2     1    2      0        1 0.4896125
> 3     2    3      0        1 0.7012899
> 4     3    4      0        1        NA
> 5     0    1      0        2 0.7964361
> 6     1    2      0        2 0.8466039
> 7     2    3      1        2 0.2234271
>
> model1 <- aftreg(Surv(start, stop, censor)~var1, data=testdata, id=groupvar)
>> ERROR.
>
> model2 <- aftreg2(Surv(start, stop, censor)~var1, data=testdata,
> id="groupvar")
>> WORKS FINE.
>
>
> PREREQUISITES:
>
> (1) Make sure that missing values are only present at the end of a lifetime.
> The regression will yield false results if you have missing covariate data
> in the middle of a lifetime. For instance: known covariates from lifetime
> 0-10, 13-20, but not from 11-12. (Göran: Please correct me if I'm wrong
> here!).
>
> (2) If you have missing covariate data at the beginning of a lifetime (eg.
> missing from 0-5, but present from 6-censoring), this fix will yield false
> results if one _cannot_ assume that the missing covariates were the same
> from 0-5 as they were at 6. (Göran: Please correct me again if I'm wrong
> here with my interpretation, but that's basically what you said before)
>
>
> LISTING:
>
> aftreg2 <- function(formula, data, id, ...){
>
>        call <- match.call()
>
>        non_na_cols <- attr(attr(terms(formula), "factors"),
> "dimnames")[2][[1]]
>
>        data <- data[complete.cases(data[non_na_cols]),]
>        data <- data[complete.cases(data[id]),]
>
>        cat("Original Call: ")
>        print(call)
>
>        return(aftreg(formula=formula, data=data, id=data[,id], ...))
> }
>
>
> Hope someone finds this interesting.
>
> All the best
> Philipp
>
>
> David Winsemius wrote:
>>
>> On Feb 11, 2010, at 5:58 AM, Philipp Rappold wrote:
>>
>>> Göran, thanks!
>>>
>>> One more thing that I found: As soon as you have at least one NA in the
>>> independent vars, the trick that you mentioned does not work anymore.
>>> Example:
>>>
>>> > testdata
>>>  start stop censor groupvar      var1
>>> 1     0    1      0        1 0.1284928
>>> 2     1    2      0        1 0.4896125
>>> 3     2    3      0        1 0.7012899
>>> 4     3    4      0        1        NA
>>> 5     0    1      0        2 0.7964361
>>> 6     1    2      0        2 0.8466039
>>> 7     2    3      1        2 0.2234271
>>>
>>> > aftreg(Surv(start, stop, censor)~var1, data=testdata,
>>> > id=testdata$groupvar)
>>> Error in order(id, Y[, 1]) : Different length of arguments (* I
>>> translated this from the German Output *)
>>>
>>> Do you think there is a simple hack which excludes all subjects that have
>>> at least one NA in their independent vars? If it was only one dependent var
>>> it would probably be easy by just using subset, but I have lots of different
>>> combinations of vars that I'd like to test ;)
>>>
>>
>> I don't know if it's a "hack", but there are a set of functions that
>> perform such subsetting:
>>
>> ?na.omit
>>
>> There is a parameter that would accomplish that goal inside aftreg. You
>> may want to check what your defaults are for na.action.
>>
>
>



Re: [R] Extract p-value from aftreg object

2010-02-18 Thread David Winsemius
I don't seem to have the unnamed package loaded that has aftreg(), but  
in general you ought to be able to get what you want by looking not  
just at the aftreg-object but also at the print(aftreg)-object using  
str().



On Feb 18, 2010, at 11:07 AM, Philipp Rappold wrote:


Dear all,

does anyone know how I can extract specific p-values for covariates
from an aftreg object? After fitting a model with aftreg I can find
all different variables by using str(), but there's no place where
p-values are kept. The odd thing is that print() displays them
correctly.


EXAMPLE:

> testdata
  start stop censor groupvar       var1      var2
1     0    1      0        1 0.91663902 0.0847912
2     1    2      0        1 0.60470753 0.6487798
3     2    3      0        1 0.09599891 0.2195178
4     3    4      1        1 0.86384189 0.6667897
5     0    1      0        2 0.07747445 0.8782836
6     1    2      0        2 0.44608030 0.2218685
7     2    3      1        2 0.77317152 0.3813840


> fit1 <- aftreg(Surv(start, stop, censor)~var1, data=testdata)


> fit1
Call:
aftreg(formula = Surv(start, stop, censor) ~ var1, data = testdata)

Covariate          W.mean      Coef Exp(Coef)  se(Coef)    Wald p
var1                0.540     0.150     1.162     0.770     0.845

log(scale)                    1.358     3.890     0.260     0.000
log(shape)                    2.015     7.502     0.781     0.010

Events                    2
Total time at risk             7
Max. log. likelihood      -1.4026
LR test statistic         0.05
Degrees of freedom        1
Overall p-value           0.816436
WALD P IS DISPLAYED CORRECTLY.


Any help is highly appreciated, I'm going nuts here ;)

Thanks
Philipp

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT



[R] an error about " return some vectors from some functions within a function"

2010-02-18 Thread Nai-Wei Chen
Dear all,

When I try to return some vectors from some functions within a function, it
indicates an error, "Error in rbind(ck1, ck2, ck3) : object 'ck1' not found",
in one of the iterations and stops.  Since I am not experienced in programming,
can anyone give me a suggestion for inspecting this error?
The following are the functions I created:

###
# functions in the convg #
###
check1 <- function(sumgt,beta1.0,gamma.0,sigma.0){
   if (any(!is.finite(sumgt))){
       count1 <- count1+1
       return(c(count1,beta1.0,gamma.0,sigma.0))
   }
  else {return(c(NaN,NaN,NaN,NaN))}
 }
check2 <- function(v0,maxit,iter,beta1.0,gamma.0,sigma.0){
    if (is.nan(sum(v0))==TRUE | any(!is.finite(v0)) | maxit == iter){
       count1 <- count1+1
       return(c(count1,beta1.0,gamma.0,sigma.0)) 
   } 
   else {return(c(NaN,NaN,NaN,NaN))}
 }
check3 <- function(maxit,diff,error,beta1.0,gamma.0,sigma.0){
    if (diff < error) { 
       return(c(count,beta1.0,gamma.0,sigma.0))           
   }
   else {return(c(NaN,NaN,NaN,NaN))}    
 }

convg <- 
function(count1,count,sub,rep,n,data1,beta1.0,gamma.0,sigma.0,v0,L,diff,error,iter,maxit){
  
 while(diff > error && maxit < iter && max(abs(c(beta1.0,gamma.0,sigma.0))) < 
10 && is.nan(sum(v0))==FALSE && any(is.finite(v0))){
           :
           :
 ck1 <- check1(sumgt,beta1.0,gamma.0,sigma.0)
           : 
           :
 ck2 <- check2(v0,maxit,iter,beta1.0,gamma.0,sigma.0)
           :
           :
 ck3 <- check3(maxit,diff,error,beta1.0,gamma.0,sigma.0)
 
} 
return(rbind(ck1,ck2,ck3))
}
 
Thank you so much


Sincerely,


Joe


  



[R] problem with multiple plots (mfrow, mar)

2010-02-18 Thread Peter Neuhaus

Dear R-users,

I often stack plots that share the same x-axis. To save space and make
the plots themselves as large as possible, I like to reduce the margins
between the plots to zero. I use the "mfrow" and "mar" parameters to
achieve this.

However, the different margin settings for the individual plots lead to
the inner plots being taller than the two outer plots. To make the
data in the individual subplots visually comparable, I would like
all plots to have a plotting area of exactly the same height.

How would that be done in R?

Here's some example code to illustrate my problem:

# BEGIN
x <- 1:10
a <- runif(10, 0, 5)
b <- runif(10, 0, 5)
c <- runif(10, 0, 5)

ylim <- c(0, 5)

par(mfrow=c(3,1))
par(mar=c(0,4.1,2.1,3.1))

plot(x, a, type="o", ylim=ylim, axes=FALSE)
axis(1, labels=FALSE)
axis(2)
axis(3)
axis(4)
box()

par(mar=c(0,4.1,0,3.1))

plot(x, b, type="o", ylim=ylim, axes=FALSE)
axis(1, labels=FALSE)
axis(2)
axis(3, labels=FALSE)
axis(4)
box()

par(mar=c(2.1,4.1,0,3.1))

plot(x, c, type="o", ylim=ylim, axes=FALSE)
axis(1)
axis(2)
axis(3, labels=FALSE)
axis(4)
box()
# END
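One way to get identical panel heights (a sketch, not from a reply in the thread) is to give every panel the same "mar" and move the shared top/bottom space into the outer margin "oma"; with equal inner margins, mfrow's equal figure regions yield equal plot regions:

```r
x <- 1:10
a <- runif(10, 0, 5); b <- runif(10, 0, 5); c <- runif(10, 0, 5)
ylim <- c(0, 5)

par(mfrow = c(3, 1),
    mar = c(0, 4.1, 0, 3.1),  # identical inner margins -> identical plot heights
    oma = c(4, 0, 2, 0))      # bottom/top space shared by the whole column

for (y in list(a, b, c)) {
  plot(x, y, type = "o", ylim = ylim, axes = FALSE)
  axis(2); axis(4); box()
}
axis(1)                       # x-axis on the bottom panel only
```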

Thanks in advance,

Peter



Re: [R] Use of R in clinical trials

2010-02-18 Thread Frank E Harrell Jr

Bert,

I have to disagree with just part of what you said.  The ultimate
savings from using R are astronomical.  Up front it would definitely cost
more, as you so eloquently stated.  So it boils down to short-term vs.
long-term thinking.


More importantly, the statistical/graphical reports created using R are 
more informative and lead to better reviews and more clear assessment of 
pharmaceutical safety problems by not relying on reams of tables that 
put reviewers to sleep.


Frank


Bert Gunter wrote:

DISCLAIMER: This represents my personal view and in no way reflects that of
my company.

Warning: This is a long harangue that contains no useful information on R.
May be wise to delete without reading. 
--


Sorry folks, I still don't understand your comments. As Cody's original post
pointed out, there are a host of factors other than ease of programmability
or even quality of results that weigh against any change. To reiterate, all
companies have a huge infrastructure of **validated SAS code** that would
have to be replaced. This, in itself, would take years and cost tens of
millions of dollars at least. Also to reiterate, it's not only
statistical/reporting functionality but even more the integration into the
existing clinical database systems that would have to be rewritten **and
validated**. All this would have to be done while continuing full steam on
existing submissions. It is therefore not surprising to me that no pharma
company in its right mind even contemplates undertaking such an effort.

To put these things into perspective. Let's say Pfizer has 200 SAS
programmers (it's probably more, as they are a large Pharma, but I dunno).
If each programmer costs, conservatively, $200K U.S. per year fully loaded,
that's $40 million U.S. for SAS Programmers. And this is probably a severe
underestimate. So the $14M quoted below is chicken feed -- it doesn't even
make the radar. 


To add further perspective, a single (large) pivotal clinical trial can
easily cost $250M . A delay in approval due to fooling around trying to
shift to a whole new software system could easily cost hundreds of millions
to billions if it means a competitor gets to the marketplace first. So, to
repeat, SAS costs are chicken feed.

Yes, I suppose that the present system institutionalizes mediocrity. How
could it be otherwise in any such large scale enterprise? Continuity,
reliability, and robustness are all orders of magnitude more important for
both the FDA and Pharma to get safe and efficacious drugs to the public.
Constantly hopping onto the latest and greatest "craze" (yes, I exaggerate
here!) would be dangerous, unacceptable, and would probably delay drug
approvals. I consider this another example of the Kuhnian paradigm (Thomas
Kuhn: "The Structure of Scientific Revolutions") in action.

This is **not** to say that there is not a useful role for R (or STATA or
...) to play in clinical trial submissions or, more generally, in drug
research and development. There certainly is. For the record, I use R
exclusively in my (nonclinical statistics) work. Nor is it to say that all
change must be avoided. That would be downright dangerous. But let's please
keep these issues in perspective. One's enthusiasm for R's manifold virtues
should not replace common sense and logic. That, too, would be unfortunate.

Since I've freely blustered, I am now a fair target. So I welcome forceful
rebuttals and criticisms and, as I've said what I wanted to, I will not
respond. You have the last word.

Bert Gunter
Genentech Nonclinical Biostatistics
 



 -Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Frank E Harrell Jr
Sent: Thursday, February 18, 2010 6:01 AM
To: bill.venab...@csiro.au
Cc: r-help@r-project.org; p.dalga...@biostat.ku.dk
Subject: Re: [R] Use of R in clinical trials

I really like both of your responses.  To add to Peter's thoughts, I've 
found that more than half of SAS programmers can learn modern 
programming languages given a push.  And if pharmaceutical companies 
ever knew the true cost of SAS in terms of their having to hire more 
programmers to deal with an archaic language they would be astonished. 
Rumor had it that Pfizer's yearly SAS licensing costs were $14M/year 
several years ago.  Programmer costs were probably in the same range.


Frank


bill.venab...@csiro.au wrote:

I can't believe I'm saying this, but I think Peter is being a bit harsh on
SAS.  

I prefer Greg Snow's analogy (in the fortune collection): If SPSS (or SAS)
and R were vehicles, SPSS would be the bus, going on fixed routes and
efficiently carrying lots of people to standard places, whereas R is the
off-road 4WD SUV, complete with all sorts of kit including walking boots,
kayak on the top, &c.  R will take you anywhere you want to go, but it might
take you longer to master it than the simple recipes for data analysis
typical of the 'bus' programs.


Bill Venables
CSIRO/CMIS Cleveland

Re: [R] Use of R in clinical trials

2010-02-18 Thread Bert Gunter
DISCLAIMER: This represents my personal view and in no way reflects that of
my company.

Warning: This is a long harangue that contains no useful information on R.
May be wise to delete without reading. 
--

Sorry folks, I still don't understand your comments. As Cody's original post
pointed out, there are a host of factors other than ease of programmability
or even quality of results that weigh against any change. To reiterate, all
companies have a huge infrastructure of **validated SAS code** that would
have to be replaced. This, in itself, would take years and cost tens of
millions of dollars at least. Also to reiterate, it's not only
statistical/reporting functionality but even more the integration into the
existing clinical database systems that would have to be rewritten **and
validated**. All this would have to be done while continuing full steam on
existing submissions. It is therefore not surprising to me that no pharma
company in its right mind even contemplates undertaking such an effort.

To put these things into perspective. Let's say Pfizer has 200 SAS
programmers (it's probably more, as they are a large Pharma, but I dunno).
If each programmer costs, conservatively, $200K U.S. per year fully loaded,
that's $40 million U.S. for SAS Programmers. And this is probably a severe
underestimate. So the $14M quoted below is chicken feed -- it doesn't even
make the radar. 

To add further perspective, a single (large) pivotal clinical trial can
easily cost $250M . A delay in approval due to fooling around trying to
shift to a whole new software system could easily cost hundreds of millions
to billions if it means a competitor gets to the marketplace first. So, to
repeat, SAS costs are chicken feed.

Yes, I suppose that the present system institutionalizes mediocrity. How
could it be otherwise in any such large scale enterprise? Continuity,
reliability, and robustness are all orders of magnitude more important for
both the FDA and Pharma to get safe and efficacious drugs to the public.
Constantly hopping onto the latest and greatest "craze" (yes, I exaggerate
here!) would be dangerous, unacceptable, and would probably delay drug
approvals. I consider this another example of the Kuhnian paradigm (Thomas
Kuhn: "The Structure of Scientific Revolutions") in action.

This is **not** to say that there is not a useful role for R (or STATA or
...) to play in clinical trial submissions or, more generally, in drug
research and development. There certainly is. For the record, I use R
exclusively in my (nonclinical statistics) work. Nor is it to say that all
change must be avoided. That would be downright dangerous. But let's please
keep these issues in perspective. One's enthusiasm for R's manifold virtues
should not replace common sense and logic. That, too, would be unfortunate.

Since I've freely blustered, I am now a fair target. So I welcome forceful
rebuttals and criticisms and, as I've said what I wanted to, I will not
respond. You have the last word.

Bert Gunter
Genentech Nonclinical Biostatistics
 


 -Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Frank E Harrell Jr
Sent: Thursday, February 18, 2010 6:01 AM
To: bill.venab...@csiro.au
Cc: r-help@r-project.org; p.dalga...@biostat.ku.dk
Subject: Re: [R] Use of R in clinical trials

I really like both of your responses.  To add to Peter's thoughts, I've 
found that more than half of SAS programmers can learn modern 
programming languages given a push.  And if pharmaceutical companies 
ever knew the true cost of SAS in terms of their having to hire more 
programmers to deal with an archaic language they would be astonished. 
Rumor had it that Pfizer's yearly SAS licensing costs were $14M/year 
several years ago.  Programmer costs were probably in the same range.

Frank


bill.venab...@csiro.au wrote:
> I can't believe I'm saying this, but I think Peter is being a bit harsh on
SAS.  
> 
> I prefer Greg Snow's analogy (in the fortune collection): If SPSS (or SAS)
and R were vehicles, SPSS would be the bus, going on fixed routes and
efficiently carrying lots of people to standard places, whereas R is the
off-road 4WD SUV, complete with all sorts of kit including walking boots,
kayak on the top, &c.  R will take you anywhere you want to go, but it might
take you longer to master it than the simple recipes for data analysis
typical of the 'bus' programs.
> 
> 
> Bill Venables
> CSIRO/CMIS Cleveland Laboratories
> 
> 
> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Peter Dalgaard
> Sent: Thursday, 18 February 2010 5:55 PM
> To: Frank E Harrell Jr
> Cc: r-help@r-project.org; Cody Hamilton
> Subject: Re: [R] Use of R in clinical trials
> 
> Frank E Harrell Jr wrote:
>> Cody,
>>
>> How amazing that SAS is still used to produce reports that reviewers 
>> hate and that requires tedious low-level programming.  R + LaTeX has it 
>> a

Re: [R] Checking the assumptions for a proper GLM model

2010-02-18 Thread David Winsemius
At one time the "answer" would have been to buy a copy of Venables and  
Ripley's "Modern Applied Statistics with S" (and R), and that would  
still be a sensible strategy. There are now quite a few other R- 
centric texts that have been published in the last few years. Search  
Amazon if needed. You seem to be asking for a tutorial on generalized
linear modeling (which, if you read the Posting Guide, you will find is
not a service offered by the r-help list). Perhaps you should have
edited the link you provided in the obvious fashion:


http://www.ats.ucla.edu/stat/R/

Perhaps one of these pages:
http://www.ats.ucla.edu/stat/R/dae/default.htm

The UCLA Statistics website used to be dismissive of R, but they more  
recently appear to have seen the light. There is also a great amount  
of contributed teaching material on CRAN:


http://cran.r-project.org/other-docs.html

... and more would be readily available via Googling with "r-project"  
as part of a search strategy. Frank Harrell's material is in  
particular quite useful:


http://biostat.mc.vanderbilt.edu/wiki/Main/StatComp

--
David.

On Feb 18, 2010, at 8:32 AM, Jay wrote:


So what I'm looking for is readily available tools/packages that could
produce some of the following:

3.6 Summary of Useful Commands (STATA: Source:
http://www.ats.ucla.edu/stat/Stata/webbooks/logistic/chapter3/statalog3.htm)

   * linktest--performs a link test for model specification, in our
case to check if logit is the right link function to use. This command
is issued after the logit or logistic command.



and performs nonlinearity test.

But, since I'm new to GLM, I would greatly appreciate how you/others
go about and test the validity of a GLM model.


On Feb 18, 1:18 am, Jay  wrote:

Hello,

Are there any packages/functions available for testing the assumptions
underlying a good GLM model? Like linktest in STATA and similar. If not,
could somebody please describe their work process when they check the
validity of a logit/probit model?

Regards,
Jay
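For what it's worth, STATA's linktest amounts to refitting the model on the
fitted linear predictor and its square, and that check is easy to reproduce
with base glm() alone. A minimal sketch on simulated data (there is no
built-in linktest in R; the data and names here are illustrative only):

```r
set.seed(1)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(-1 + 2 * x))
fit <- glm(y ~ x, family = binomial)

## Refit using the linear predictor and its square as the only covariates.
## A significant squared term suggests a misspecified link or omitted
## nonlinearity, which is what STATA's linktest reports as _hatsq.
lp <- predict(fit)                       # linear predictor, eta-hat
check <- glm(y ~ lp + I(lp^2), family = binomial)
summary(check)$coefficients["I(lp^2)", ]
```

Here a large p-value for the I(lp^2) row is reassuring; a small one is a
red flag for the logit link or the linearity of the predictors.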





David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] AFTREG with ID argument

2010-02-18 Thread Philipp Rappold

Göran, no worries - your help & advice is already invaluable!

Göran Broström wrote:

2010/2/18 Philipp Rappold :

Göran, David,

in order to adapt aftreg to my needs I wrote a little function that I would
like to share with you and the community.


I once promised to fix this 'asap'. Now I promise to do it tonight. OK?

Göran



WHAT DOES IT FIX?

(1) Using the id-argument in combination with missing values on covariates
wasn't easily possible before because the id-dataframe and the
data-dataframe had different sizes and aftreg quitted with an error. My fix
makes sure that NAs are excluded from both dataframes and aftreg will run
without error here.

(2) The id-argument was required to be specified by its "absolute path" (eg.
id=testdata$groupvar, see below in this thread). My adapted function takes
the name of the id-variable as a string, eg. id="groupvar".


HOW DOES IT WORK?

Use function aftreg2 just like you would use aftreg. Mandatory arguments
are: formula, data and id, where id is a string variable. Example:


testdata

  start stop censor groupvar      var1
1     0    1      0        1 0.1284928
2     1    2      0        1 0.4896125
3     2    3      0        1 0.7012899
4     3    4      0        1        NA
5     0    1      0        2 0.7964361
6     1    2      0        2 0.8466039
7     2    3      1        2 0.2234271

model1 <- aftreg(Surv(start, stop, censor)~var1, data=testdata, id=groupvar)

ERROR.

model2 <- aftreg2(Surv(start, stop, censor)~var1, data=testdata,
id="groupvar")

WORKS FINE.


PREREQUISITES:

(1) Make sure that missing values are only present at the end of a lifetime.
The regression will yield false results if you have missing covariate data
in the middle of a lifetime. For instance: known covariates from lifetime
0-10, 13-20, but not from 11-12. (Göran: Please correct me if I'm wrong
here!).

(2) If you have missing covariate data at the beginning of a lifetime (eg.
missing from 0-5, but present from 6-censoring), this fix will yield false
results if one _cannot_ assume that the missing covariates were the same
from 0-5 as they were at 6. (Göran: Please correct me again if I'm wrong
here with my interpretation, but that's basically what you said before)


LISTING:

aftreg2 <- function(formula, data, id, ...) {

    call <- match.call()

    ## Names of the covariates that appear in the model formula
    non_na_cols <- attr(attr(terms(formula), "factors"), "dimnames")[[2]]

    ## Drop rows with NAs in any model covariate, then in the id variable,
    ## so that the data and the id vector stay the same length
    data <- data[complete.cases(data[non_na_cols]), ]
    data <- data[complete.cases(data[id]), ]

    cat("Original Call: ")
    print(call)

    ## 'id' is a string, so look the grouping variable up in the cleaned data
    aftreg(formula = formula, data = data, id = data[, id], ...)
}


Hope someone finds this interesting.

All the best
Philipp


David Winsemius wrote:

On Feb 11, 2010, at 5:58 AM, Philipp Rappold wrote:


Göran, thanks!

One more thing that I found: As soon as you have at least one NA in the
independent vars, the trick that you mentioned does not work anymore.
Example:


testdata

  start stop censor groupvar      var1
1     0    1      0        1 0.1284928
2     1    2      0        1 0.4896125
3     2    3      0        1 0.7012899
4     3    4      0        1        NA
5     0    1      0        2 0.7964361
6     1    2      0        2 0.8466039
7     2    3      1        2 0.2234271


aftreg(Surv(start, stop, censor)~var1, data=testdata,
id=testdata$groupvar)

Error in order(id, Y[, 1]) : Different length of arguments (* I
translated this from the German Output *)

Do you think there is a simple hack which excludes all subjects that have
at least one NA in their independent vars? If it was only one independent var
it would probably be easy by just using subset, but I have lots of different
combinations of vars that I'd like to test ;)


I don't know if it's a "hack", but there are a set of functions that
perform such subsetting:

?na.omit

There is a parameter that would accomplish that goal inside aftreg. You
may want to check what your defaults are for na.action.
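For reference, a minimal sketch of both approaches on a toy data frame
(hypothetical values, modelled loosely on the testdata example in this
thread):

```r
## Toy data frame with one incomplete row (hypothetical values)
testdata <- data.frame(start  = c(0, 1, 2, 3),
                       stop   = c(1, 2, 3, 4),
                       censor = c(0, 0, 0, 1),
                       var1   = c(0.13, 0.49, 0.70, NA))

na.omit(testdata)                     # drops row 4, which has NA in var1
testdata[complete.cases(testdata), ]  # equivalent, more explicit
```

na.omit() also records which rows it dropped in an "na.action" attribute,
which is handy for checking what was removed.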









Re: [R] row indexes from logical statment

2010-02-18 Thread Dennis Murphy
Hi:

You might also want to consider the use of subset, as in

subset(foo, name == "A")or
subset(foo, name %in% c("A", "B"))

HTH,
Dennis
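To get the row indexes themselves (rather than the rows), which() converts
the logical test directly. A small sketch using the foo data frame from the
original question:

```r
## The example data frame from the question (random columns, fixed names)
foo <- data.frame(name   = c(rep("A", 25), rep("B", 25), rep("C", 25), rep("A", 25)),
                  stuff  = rnorm(100),
                  and    = rnorm(100),
                  things = rnorm(100))

which(foo$name == "A")             # indexes of rows where name is "A"
which(foo$name %in% c("A", "B"))   # indexes where name is "A" or "B"
foo[foo$name %in% c("A", "B"), ]   # the matching rows themselves
```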

On Thu, Feb 18, 2010 at 8:01 AM, stephen sefick  wrote:

> Is there any easy way to pull out the row indexes for a logical
> matching statement?
>
> #example code#
> foo <- data.frame(name=c(rep("A", 25), rep("B", 25), rep("C", 25),
> rep("A", 25)), stuff=rnorm(100), and=rnorm(100), things=rnorm(100))
>
> #this is what I want but I would like the row indexes
> foo[foo[, 1] == "A", ]
> ##
>
> Also, is there a way to get both A or B into the logical statement
>
> thanks so much for all of your help
>
> Stephen
>
> --
> Stephen Sefick
>
> Let's not spend our time and resources thinking about things that are
> so little or so large that all they really do for us is puff us up and
> make us feel like gods.  We are mammals, and have not exhausted the
> annoying little problems of being mammals.
>
>-K. Mullis
>




  1   2   >