Re: [R] predict.lm(...,type="terms") question

2012-08-31 Thread David Winsemius


On Aug 31, 2012, at 3:48 PM, jjthaden wrote:


On Aug 30, 2012, at 4:35 PM,  David Windemius wrote:

David said my newdata data frame 'new' must have a column named  
'area'.
It did. Nonetheless predict.lm throws an error with type = "terms"  
and

newdata = new. I see nothing in the predict.lm  documentation that
bars this usage. Is there a bug?


After correcting the error in your definition of the 'area' vector I
get no error from predict.lm().



David.


What error did you correct?


This was the code you offered:

#Ludbrook's data set S1 (except renaming
#his 'x' as 'concn' and his 'y' as 'area')
S1 <- data.frame(
area = c(2.4,2.6,6.0,6.5,8.9,),
concn = c(1.1,4,5,8.5,8.5))

There's an extra comma in the "area" vector.



The newdata data frames in my examples have been pretty simple, and  
have

defined the 'area' vector simply, for instance,
new <- data.frame(area = c(8172, 10220, 11570, 24150))
new
#area
#1  8172
#2 10220
#3 11570
#4 24150

Is something wrong with this?


I don't really know    this appears to be a different problem than  
you posed earlier. If you would learn to post with context, we might  
not be in the position of trying to read your mind.


 Were you able to make predict.lm work with newdata = new and type =  
"terms"?


Yes. I have no trouble doing so.

>  x <- rnorm(15); x2=rnorm(15)
>  y <- x + x2 +rnorm(15)
>  fit <- lm(y ~ x+x2)
>  new <- data.frame(x2 = seq(-3.5, 3.5, 0.5)  )
>  predict(fit, newdata=new,type="terms", terms="x2")
   x2
1  -4.9230035
2  -4.1850588
3  -3.4471141
4  -2.7091694
5  -1.9712248
6  -1.2332801
7  -0.4953354
8   0.2426093
9   0.9805539
10  1.7184986
11  2.4564433
12  3.1943880
13  3.9323326
14  4.6702773
15  5.4082220
attr(,"constant")
[1] 0.4501911




-John



Sent from the (NOT) R help mailing list  (and NOT) archive at  
Nabble.com.


Hmmm ... Nabble ... definitely part of the problem.

--

David Winsemius, MD
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] predict.lm(...,type="terms") question

2012-08-31 Thread David Winsemius


On Aug 31, 2012, at 5:27 PM, jjthaden wrote:

The error is generated in the last line of code shown here from  
predict.lm


The code you offered had an extra comma in the 'area' vector. Removing  
allowed the fitting and prediction to proceed without error, I do not  
see why you are producing code from predict.lm.






predict.lm

function (object, newdata, se.fit = FALSE, scale = NULL, df = Inf,
   interval = c("none", "confidence", "prediction"), level = 0.95,
   type = c("response", "terms"), terms = NULL, na.action = na.pass,
   pred.var = res.var/weights, weights = 1, ...)
{
   tt <- terms(object)
   if (!inherits(object, "lm"))
   warning("calling predict.lm() ...")
   if (missing(newdata) || is.null(newdata)) {
   mm <- X <- model.matrix(object)
   mmDone <- TRUE
   offset <- object$offset
   }
   else {
   Terms <- delete.response(tt)
   m <- model.frame(Terms, newdata, na.action = na.action,
   xlev = object$xlevels)
#MORE LINES DELETED HERE...

As written, if newdata is specified but it doesn't contain every  
term  in

the original data frame, then an error is thrown.


That statement is also not reproducible. To gain further attention you  
should offer code that demonstrates what errors you are seeing.


x <- rnorm(15); x2=rnorm(15)
 y <- x + x2 +rnorm(15)
 fit <- lm(y ~ x+x2)
 new <- data.frame(x = seq(-3.5, 3.5, 0.5)  )
 predict(fit, newdate=new)
# no error is thrown.

Given the fact that your earlier effort had an obvious syntax error  
that had absolutely nothing to do with predict.lm argument, I would  
ask that you conduct better testing of code that you offer for review.  
You now show no evidence for your assertion about the source of the  
error. Where is the error message and the results of traceback()?  
Where is the data. Where is the code?


(And please include context. (It is not difficult to do so with Nabble.)


But for predict.lm to
predict values for a term, i.e., type = "term", those terms cannot  
be in the

newdata vector.
-John




--

David Winsemius, MD
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] splits with 0s in middle columns

2012-08-31 Thread arun
HI,
You can also use ifelse:
dat1<-read.table(text="
10/20/30
40/20
60/10/10/5
80/10
",sep="",header=FALSE,stringsAsFactors=FALSE)
dat2<-ifelse(nchar(gsub("[^/]","",dat1$V1))==1,gsub("(.*)/(.*)","\\1 0 0 
\\2",dat1$V1),ifelse(nchar(gsub("[^/]","",dat1$V1))==2,gsub("(.*)/(.*)/(.*)","\\1
 \\2 0 \\3",dat1$V1),gsub("(.*)/(.*)/(.*)/(.*)", "\\1 \\2 \\3 \\4",dat1$V1)))
 dat3<-data.frame(do.call(rbind,strsplit(dat2, " ")))
 colnames(dat3)<-paste0("A",1:4)
 dat3
#  A1 A2 A3 A4
#1 10 20  0 30
#2 40  0  0 20
#3 60 10 10  5
#4 80  0  0 10
A.K.



- Original Message -
From: Sapana Lohani 
To: R help 
Cc: 
Sent: Friday, August 31, 2012 12:10 PM
Subject: [R] splits with 0s in middle columns

Hi, 

A column of my df looks like

A
10/20/30

40/20
60/10/10/5
80/10

I want to split it such that the last column has the last composition and if 
there are not enough the middle columns get the 0s. That way my df would look 
like

A1 A2 A3 A4
10 20 0 30
40 0 0 20
60 10 10 5
80 0 0 10

How can I do that ??

    [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] splits with 0s in middle columns

2012-08-31 Thread arun
Hi,
Try this:
dat1<-read.table(text="
10/20/30
40/20
60/10/10/5
80/10
",sep="",header=FALSE,stringsAsFactors=FALSE)
dat2<-gsub("(.*)/(.*)","\\1 0 0 \\2", gsub("(.*)/(.*)/(.*)","\\1 \\2 0 \\3", 
gsub("(.*)/(.*)/(.*)/(.*)","\\1 \\2 \\3 \\4",dat1$V1)))
dat3<-data.frame(do.call(rbind,strsplit(dat2, " ")))
 colnames(dat3)<-paste0("A",1:4)
 dat3
#  A1 A2 A3 A4
#1 10 20  0 30
#2 40  0  0 20
#3 60 10 10  5
#4 80  0  0 10
A.K.




- Original Message -
From: Sapana Lohani 
To: R help 
Cc: 
Sent: Friday, August 31, 2012 12:10 PM
Subject: [R] splits with 0s in middle columns

Hi, 

A column of my df looks like

A
10/20/30

40/20
60/10/10/5
80/10

I want to split it such that the last column has the last composition and if 
there are not enough the middle columns get the 0s. That way my df would look 
like

A1 A2 A3 A4
10 20 0 30
40 0 0 20
60 10 10 5
80 0 0 10

How can I do that ??

    [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R package fields: Thin-plate splines question

2012-08-31 Thread Rebecca Clark
Greetings, R helpers,

I have been using the Tps() function in the fields package to model response 
surfaces for some dietary research in crickets.  Overall, I have *three* 
independent variables of interest, two continuous variables ("protein" and 
"carbohydrate") and one categorical variable (cricket "morph").  My overall 
goal is to produce two heat maps where colors are scaled on the same absolute 
scale.  As best I see it, there are two possible ways to achieve this: some 
adjustment to the image() function parameters, or some adjustment to the Tps() 
function and subsequent method to separate the two plots (this would also 
require rearrangement of the data frame).  

However, I am stumped on the particulars.  Does anyone have an idea of how to 
do this?  I am including (hopefully reproducible) syntax for the two separate 
surfaces below.

Thank you,
Rebecca Clark
Postdoctoral Research Associate
Texas A&M University


##

library(fields)

triact<-structure(list(Diet = structure(1:13, .Label = c("A", "B", "C", "D", 
"E", "F", "G", "H", "I", "J", "K", "L", "M"), class = "factor"), LW = c(19, 14, 
9, 19, 13, 6.8, 16.9, 2.4, 18.1, 3.5, 16, 8.3, 11), SW = c(5.5, 8.5, 11.7, 6, 
3.5, 4, 5.5, 3.2, 9.1, 3.1, 7.5, 3, 3.5), protein = c(18, 28, 8, 28.75, 4, 
17.25, 13, 14, 16.25, 9, 27, 9.75, 23), carbohydrate = c(24, 14, 34, 23.75, 17, 
14.25, 29, 7, 36.25, 12, 36, 21.75, 19)), .Names = c("Diet", "LW", "SW", 
"protein", "carbohydrate"), class = "data.frame", row.names = c(NA, -13L))

attach(triact)

surf.teLW=Tps(cbind(protein,carbohydrate),LW, lambda = 0.01)
surf.teSW=Tps(cbind(protein,carbohydrate),SW, lambda = 0.01)

# Method for separately producing the surface and then mapping contour lines 
onto it (instead of using "surface()"):
surf.te.outLW=predict.surface(surf.teLW)
surf.te.outSW=predict.surface(surf.teSW)

par(mfrow=c(1,2))

image(surf.te.outLW, col=tim.colors(128), lwd=5, las=1, font.lab=2, 
cex.lab=1.3, mgp=c(2.7,0.5,0), font.axis=1, lab=c(4,5,6), 
xlab=expression("Protein content"), ylab=expression("Carbohydrate 
Content"),main="Triglyceride (LW)", asp=1, xlim=c(0,38), ylim=c(0,38))
contour(surf.te.outLW, lwd=2, labcex=1, add=T)

image(surf.te.outSW, col=tim.colors(128), lwd=5, las=1, font.lab=2, 
cex.lab=1.3, mgp=c(2.7,0.5,0), font.axis=1, lab=c(4,5,6), 
xlab=expression("Protein content"), ylab=expression("Carbohydrate 
Content"),main="Triglyceride (SW)", asp=1, xlim=c(0,38), ylim=c(0,38))
contour(surf.te.outSW, lwd=2, labcex=1, add=T)

# Note how different the scales are between the two images, as indicated by the 
contour lines, but the color ranges are the same.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] predict.lm(...,type="terms") question

2012-08-31 Thread jjthaden
On Aug 30, 2012, at 4:35 PM,  David Windemius wrote:

>> David said my newdata data frame 'new' must have a column named 'area'.
>> It did. Nonetheless predict.lm throws an error with type = "terms" and
>> newdata = new. I see nothing in the predict.lm  documentation that
>> bars this usage. Is there a bug?
>
> After correcting the error in your definition of the 'area' vector I  
> get no error from predict.lm().

> David.

What error did you correct? 

The newdata data frames in my examples have been pretty simple, and have
defined the 'area' vector simply, for instance,
new <- data.frame(area = c(8172, 10220, 11570, 24150))
new
#area
#1  8172
#2 10220
#3 11570
#4 24150

Is something wrong with this? Were you able to make predict.lm work with
newdata = new and type = "terms"?

-John




--
View this message in context: 
http://r.789695.n4.nabble.com/predict-lm-type-terms-question-tp4641665p4641947.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Conditional merging in R & if then statement

2012-08-31 Thread Jeff Newmiller
1) just because you don't use sqldf doesn't make it better. merge only does the 
join, not the conditions, so you have to apply them after merge. Untested (you 
should provide a complete example):

result <- merge ( detail, pdetail, by=TDATE, suffixes=c(".a",".b") )
result <- subset ( result, STIM.a >= STIM.b and STIM.a <=MAXTIM.b ) 

You can add a select argument to subset to remove the pdetail columns.

2) if is different than ifelse ... read the Introduction to R document, and/or 
read the help pages on if and ifelse.
---
Jeff NewmillerThe .   .  Go Live...
DCN:Basics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

ramoss  wrote:

>1)I am wandering how the following SQL statement can be written in R
>language
>w/o using sqldf:
>create table  detail2 as
>select a.*
>from   detail a,
>   pdetail b
>where a.TDATE=b.TDATE
>and(a.STIM >= b.STIM and a.STIM <=b.MAXTIM)
>
>2) when try if then in R it only applies to the 1st row & not to whole
>dataset like in SAS. How do you get round that?
>in SAS:
>
>data summary;
>  set all1;
>  if entry='a:prop' then pctexec=stkful/stocks*100;
>run;
>
>
>
>Thanks in advance for your help.
>
>
>
>--
>View this message in context:
>http://r.789695.n4.nabble.com/Conditional-merging-in-R-if-then-statement-tp4641936.html
>Sent from the R help mailing list archive at Nabble.com.
>
>__
>R-help@r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] predict.lm(...,type="terms") question

2012-08-31 Thread jjthaden
The error is generated in the last line of code shown here from predict.lm

> predict.lm
function (object, newdata, se.fit = FALSE, scale = NULL, df = Inf, 
interval = c("none", "confidence", "prediction"), level = 0.95, 
type = c("response", "terms"), terms = NULL, na.action = na.pass, 
pred.var = res.var/weights, weights = 1, ...) 
{
tt <- terms(object)
if (!inherits(object, "lm")) 
warning("calling predict.lm() ...")
if (missing(newdata) || is.null(newdata)) {
mm <- X <- model.matrix(object)
mmDone <- TRUE
offset <- object$offset
}
else {
Terms <- delete.response(tt)
m <- model.frame(Terms, newdata, na.action = na.action, 
xlev = object$xlevels)
#MORE LINES DELETED HERE...

As written, if newdata is specified but it doesn't contain every term  in
the original data frame, then an error is thrown. But for predict.lm to
predict values for a term, i.e., type = "term", those terms cannot be in the
newdata vector. 
-John



--
View this message in context: 
http://r.789695.n4.nabble.com/predict-lm-type-terms-question-tp4641665p4641951.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] test Breslow-Day for svytable??

2012-08-31 Thread Thomas Lumley
On Sat, Sep 1, 2012 at 4:27 AM, David Winsemius  wrote:
>
> On Aug 31, 2012, at 7:20 AM, Diana Marcela Martinez Ruiz wrote:
>
>> Hi all,
>>
>> I want to know how to perform the test Breslow-Day test for homogeneity of
>> odds ratios (OR) stratified for svytable. This test is obtained with the 
>> following code:
>>
>> epi.2by2 (dat = daty, method = "case.control" conf.level = 0.95,
>
> missing comma here ...^
>
>>units = 100, homogeneity = "breslow.day", verbose = TRUE)
>>
>> where "daty" is the object type table  svytable consider it, but when I run 
>> the code
>> does not throw the homogeneity test.
>
> You are asked in the Posting guide to copy all errors and warnings when 
> asking about unexpected behavior. When I run epi.2y2 on the output of a 
> syvtable object I get no errors, but I do get warnings which I think are due 
> to non-integer entries in the weighted table. I also get from a svytable() 
> usingits first example on the help page an object that is NOT a set of 2 x 2 
> tables in an array of the structure as expected by epi.2by2(). The fact that 
> epi.2by2() will report numbers with labels for a 2 x 3 table means that its 
> error checking is weak.
>
> This is the output of str(dat) from one of the example on epi.2by2's help 
> page:
>
>> str(dat)
>  table [1:2, 1:2, 1:3] 41 13 6 53 66 37 25 83 23 37 ...
>  - attr(*, "dimnames")=List of 3
>   ..$ Exposure: chr [1:2] "+" "-"
>   ..$ Disease : chr [1:2] "+" "-"
>   ..$ Strata  : chr [1:3] "20-29 yrs" "30-39 yrs" "40+ yrs"
>
> Notice that is is a 2 x 2 x n array. (Caveat:: from here on out I am simply 
> reading the help pages and using str() to look at the objects created to get 
> an idea regarding success or failure. I am not an experienced user of either 
> package.)  I doubt that  what you got from svytable is a 2 x 2 table. As 
> another example you can build a 2 x 2 x n table from the built-in dataset: 
> UCBAdmissions
>
> DF <- as.data.frame(UCBAdmissions)
> ## Now 'DF' is a data frame with a grid of the factors and the counts
> ## in variable 'Freq'.
> dat2 <- xtabs(Freq ~ Gender + Admit+Dept, DF)
> epiR::epi.2by2(dat = dat2, method = "case.control", conf.level = 0.95,
>  units = 100, homogeneity = "breslow.day", verbose = TRUE)$OR.homog
> #-
>   test.statistic dfp.value
> 1   18.82551  5 0.00207139
>
> Using svydesign and svytable I _think_ this is how one would go about 
> constructing a 2 x 2 table:
>
> tbl2<-svydesign(  ~ Gender + Admit+Dept, weights=~Freq, data=DF)
>   summary(dclus1)
> (tbl2by2 <- svytable(~ Gender + Admit+Dept, tbl2))
>  epiR::epi.2by2(dat = tbl, method = "case.control", conf.level = 0.95,
>  units = 100, homogeneity = "breslow.day", verbose = TRUE)$OR.homog
> #---
>   test.statistic dfp.value
> 1   18.82551  5 0.00207139
>
> (At least I got internal consistency. I see you copied Thomas Lumley, which 
> is a good idea. I'll be happy to get corrected on any point. I'm adding the 
> maintainer of epiR to the recipients.)
>

Yes, that will give internal consistency from a data structure point
of view.  It won't give a valid test in real examples, though --
epi.2by2 doesn't know about complex sampling, and what you're passing
it is just an estimate of the population 2x2xK table.

What would work, though it's not quite the same as the Breslow-Day
test, is to use svyloglin() and do a Rao-Scott test comparing the
model with all two-way interactions ~(Gender+Dept+Admit)^2 to the
saturated model ~Gender*Dept*Admit.

-thomas


-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] using apply with sparse matrix from package Matrix

2012-08-31 Thread Jennifer Lyon
Hi:

I was trying to use apply on a sparse matrix from package Matrix,
and I get the error:

Error in asMethod(object) :
  Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 106

Is there a way to apply a function to all the rows without bumping
into this problem?

Here is a simplified example:

> dim(sm)
[1] 72913 43052

> class(sm)
[1] "dgCMatrix"
attr(,"package")
[1] "Matrix"

> str(sm)
Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
  ..@ i   : int [1:6590004] 789 801 802 1231 1236 11739 17817
17943 18148 18676 ...
  ..@ p   : int [1:43053] 0 147 303 450 596 751 908 1053 1188 1347 ...
  ..@ Dim : int [1:2] 72913 43052
  ..@ Dimnames:List of 2
  .. ..$ : NULL
  .. ..$ : NULL
  ..@ x   : num [1:6590004] 0.601 0.527 0.562 0.641 0.684 ...
  ..@ factors : list()

> my.sum<-apply(sm, 1, sum)
Error in asMethod(object) :
  Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 106

> sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C LC_NAME=C
 [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] Matrix_1.0-6   lattice_0.20-6 stringr_0.6.1

loaded via a namespace (and not attached):
[1] grid_2.15.1 plyr_1.7.1

Thanks.

Jen

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to find the index of points selected from a scatter plot?

2012-08-31 Thread Peter Ehlers

On 2012-08-31 16:03, Bert Gunter wrote:

?which ##
as in ix <- which(x==values)

-- Bert


Or maybe ?identify.

Peter Ehlers



On Fri, Aug 31, 2012 at 2:09 PM, Michael  wrote:


Hi all,

I am using "locator" to select the points from a scatter plot...

This is all fine.

But the problem is that the locator only returns the axis values of the
selected points.

Instead, I would like to get the index of these select points...

The axis values are real-values so it's a bit hard for me to directly
reverse-engineer the index nubmers..

How to do that?

Thank you!

 [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.







__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to find the index of points selected from a scatter plot?

2012-08-31 Thread Bert Gunter
?which ##
as in ix <- which(x==values)

-- Bert

On Fri, Aug 31, 2012 at 2:09 PM, Michael  wrote:

> Hi all,
>
> I am using "locator" to select the points from a scatter plot...
>
> This is all fine.
>
> But the problem is that the locator only returns the axis values of the
> selected points.
>
> Instead, I would like to get the index of these select points...
>
> The axis values are real-values so it's a bit hard for me to directly
> reverse-engineer the index nubmers..
>
> How to do that?
>
> Thank you!
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] splits with 0s in middle columns

2012-08-31 Thread Rui Barradas

Hello,

It means you should update your version of R. paste0 was introduced with 
R 2.15.0 as is the shorter equivalent of


paste(..., sep = "")

I'm sending you a small correction just in case some of the elements in 
A only have 1 element.


fun <- function(X){
xname <- deparse(substitute(X))
s <- strsplit(X, "/")
n <- max(sapply(s, length))
tmp <- numeric(n)

f <- function(x){
x <- as.numeric(x)
m <- length(x)
if(m > 1){
tmp[n] <- x[m]
tmp[seq_len(m - 1)] <- x[seq_len(m - 1)]
}else tmp[1] <- x
tmp
}

res <- do.call(rbind, lapply(s, f))
colnames(res) <- paste(xname, seq_along(s), sep = "")
data.frame(res)
}

Rui Barradas

Em 31-08-2012 21:58, Sapana Lohani escreveu:

Hi Rui,

I am getting some error message in the code you sent. It says "Error in fun(A) : could 
not find function "paste0"

I am a new in R so could not fix it. Can you help me fix that ?

Thanks





  From: Rui Barradas 
To: Sapana Lohani 
Cc: r-help 
Sent: Friday, August 31, 2012 12:35 PM
Subject: Re: [R] splits with 0s in middle columns
  
Hello,


Try the following.

A <- c(
"10/20/30",
"40/20",
"60/10/10/5",
"80/10")

fun <- function(X){
 xname <- deparse(substitute(X))
 s <- strsplit(X, "/")
 n <- max(sapply(s, length))
 tmp <- numeric(n)

 f <- function(x){
 x <- as.numeric(x)
 m <- length(x)
 tmp[n] <- x[m]
 tmp[seq_len(m - 1)] <- x[seq_len(m - 1)]
 tmp
 }

 res <- do.call(rbind, lapply(s, f))
 colnames(res) <- paste0(xname, 1:ncol(res))
 data.frame(res)
}
fun(A)

Hope this helps,

Rui Barradas

Em 31-08-2012 17:10, Sapana Lohani escreveu:

Hi,

A column of my df looks like

A
10/20/30

40/20
60/10/10/5
80/10

I want to split it such that the last column has the last composition and if 
there are not enough the middle columns get the 0s. That way my df would look 
like

A1 A2 A3 A4
10 20 0 30
40 0 0 20
60 10 10 5
80 0 0 10

How can I do that ??

 [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to find the index of points selected from a scatter plot?

2012-08-31 Thread Michael
Hi all,

I am using "locator" to select the points from a scatter plot...

This is all fine.

But the problem is that the locator only returns the axis values of the
selected points.

Instead, I would like to get the index of these select points...

The axis values are real-values so it's a bit hard for me to directly
reverse-engineer the index nubmers..

How to do that?

Thank you!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Conditional merging in R & if then statement

2012-08-31 Thread David Winsemius

On Aug 31, 2012, at 10:11 AM, ramoss wrote:

> 1)I am wandering how the following SQL statement can be written in R language
> w/o using sqldf:
> create table  detail2 as
> select a.*
> from   detail a,
>   pdetail b
> where a.TDATE=b.TDATE
> and(a.STIM >= b.STIM and a.STIM <=b.MAXTIM)
> 
> 2) when try if then in R

You need to provide code that shows what you did. If you had the output of 

dat <- merge(a, b, by=TDATE)

You could then limit your data depending on the names of the columns and 
whether MATIM was a duplicate name in "a" and "b" with something along the 
lines of

subset(dat, STIM.x >= STIM.y & STIM.s <=MAXTIM.y)


> it only applies to the 1st row

Yes, that is how "if" is used in R. You need to educate yourself about R rather 
than assuming the similar words are necessarily the same function. read:

?"if"
?ifelse


> & not to whole
> dataset like in SAS. How do you get round that?
> in SAS:
> 
> data summary;
>  set all1;
>  if entry='a:prop' then pctexec=stkful/stocks*100;
> run;
> 
-- 
David Winsemius, MD
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] apply --> data.frame

2012-08-31 Thread Sam Steingold
> * William Dunlap  [2012-08-31 18:38:52 +]:
>
> Is the following something like what you are doing?

yes, absolutely, thanks a lot!


-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://pmw.org.il http://dhimmi.com
http://palestinefacts.org http://www.memritv.org http://mideasttruth.com
char*a="char*a=%c%s%c;main(){printf(a,34,a,34);}";main(){printf(a,34,a,34);}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] splits with 0s in middle columns

2012-08-31 Thread Rui Barradas

Hello,

Try the following.

A <- c(
"10/20/30",
"40/20",
"60/10/10/5",
"80/10")

fun <- function(X){
xname <- deparse(substitute(X))
s <- strsplit(X, "/")
n <- max(sapply(s, length))
tmp <- numeric(n)

f <- function(x){
x <- as.numeric(x)
m <- length(x)
tmp[n] <- x[m]
tmp[seq_len(m - 1)] <- x[seq_len(m - 1)]
tmp
}

res <- do.call(rbind, lapply(s, f))
colnames(res) <- paste0(xname, 1:ncol(res))
data.frame(res)
}
fun(A)

Hope this helps,

Rui Barradas

Em 31-08-2012 17:10, Sapana Lohani escreveu:

Hi,

A column of my df looks like

A
10/20/30

40/20
60/10/10/5
80/10

I want to split it such that the last column has the last composition and if 
there are not enough the middle columns get the 0s. That way my df would look 
like

A1 A2 A3 A4
10 20 0 30
40 0 0 20
60 10 10 5
80 0 0 10

How can I do that ??

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Creating an array from correlation matrices

2012-08-31 Thread Rui Barradas

Hello,

You create a 3d array X and then index it as if it were 1d.
Correction:

TS <- EuStockMarkets

[...etc...]

for (t in T[1:(length(T)-n)]){
 X[ , , t] <- cor(TS[t:(t+n), 1:ncol(TS)], use = "pairwise.complete.obs")
}
# Calculate correlation matrices


Also, 't' and 'T' are not good names, the first is R's matrix transpose 
function and the second one is another name for TRUE.


Hope this helps,

Rui Barradas

Em 31-08-2012 17:24, Max Frisch escreveu:

Hello everyone,

a hopefully easy to solve problem from an R novice...

I try to calculate a number of correlation matrices that finally should be 
combined in a three-dimensional array.
Here the my code with an R dataset as an example.

---

## Creation an array of correlation matrices from a rolling window application

TS <- EuStockMarkets
# Load internal dataset

n <- 30
# Choose size of rolling time window

T <- c(1:nrow(TS))
# Define number of steps

X <- array(data = NA, dim = c(ncol(TS), ncol(TS), nrow(TS)))
# Create data array

for (t in T[1:(length(T)-n)]){
  X[t] = cor(TS[t:(t+n), 1:ncol(TS)], use = "pairwise.complete.obs")
}
# Calculate correlation matrices

-

Unfortunately, I only get a warning that the dimensions do not fit... Where is 
the mistake?

THANKS A LOT!

Nico

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Conditional merging in R & if then statement

2012-08-31 Thread ramoss
1)I am wandering how the following SQL statement can be written in R language
w/o using sqldf:
create table  detail2 as
select a.*
from   detail a,
   pdetail b
where a.TDATE=b.TDATE
and(a.STIM >= b.STIM and a.STIM <=b.MAXTIM)

2) when try if then in R it only applies to the 1st row & not to whole
dataset like in SAS. How do you get round that?
in SAS:

data summary;
  set all1;
  if entry='a:prop' then pctexec=stkful/stocks*100;
run;



Thanks in advance for your help.



--
View this message in context: 
http://r.789695.n4.nabble.com/Conditional-merging-in-R-if-then-statement-tp4641936.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Creating an array from correlation matrices

2012-08-31 Thread Max Frisch
Hello everyone,

a hopefully easy to solve problem from an R novice...

I try to calculate a number of correlation matrices that finally should be 
combined in a three-dimensional array.
Here the my code with an R dataset as an example.

---

## Creation an array of correlation matrices from a rolling window application

TS <- EuStockMarkets
# Load internal dataset

n <- 30
# Choose size of rolling time window

T <- c(1:nrow(TS))
# Define number of steps

X <- array(data = NA, dim = c(ncol(TS), ncol(TS), nrow(TS)))
# Create data array

for (t in T[1:(length(T)-n)]){
 X[t] = cor(TS[t:(t+n), 1:ncol(TS)], use = "pairwise.complete.obs")
}
# Calculate correlation matrices

-

Unfortunately, I only get a warning that the dimensions do not fit... Where is 
the mistake?

THANKS A LOT!

Nico

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] splits with 0s in middle columns

2012-08-31 Thread Sapana Lohani
Hi, 

A column of my df looks like

A
10/20/30

40/20
60/10/10/5
80/10

I want to split it such that the last column has the last composition and if 
there are not enough the middle columns get the 0s. That way my df would look 
like

A1 A2 A3 A4
10 20 0 30
40 0 0 20
60 10 10 5
80 0 0 10

How can I do that ??

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help on numerical object and ifelse function

2012-08-31 Thread arun
Hi,
Try this:
z1<-c(z,z,z,z,z)
 ifelse(z1>14,x,y)
#[1] 1 2 3 4 5
A.K.



- Original Message -
From: Andras Farkas 
To: "r-help@r-project.org" 
Cc: 
Sent: Friday, August 31, 2012 7:55 AM
Subject: [R] Help on numerical object and ifelse function

Dear All,
 
this is probably an easy one but I can not get a handle on it:
 
x <-c(1,2,3,4,5)
y <-c(6,7,8,9,10)
z <-15
w <-ifelse(z>14,x,y)
 
this will give me a value of 1 for w. What I would like to get is the whole 
string of x, so that w would become a numeric object of 5 characters exactly 
the same as x.
 
Apreciate the help,
 
Sincerely,
 
Andras
    [[alternative HTML version deleted]]


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] apply --> data.frame

2012-08-31 Thread William Dunlap
It is hard to help when you don't give an example of your input data
and what you want to be computed (in a form one can source or copy
into an R session).  Is the following something like what you are doing?

Suppose you have a function that takes a file name and
returns a list of things of various types extracted from the
file.  A toy example would be
fileExtract <- function(fileName) {
   fi <-  file.info(fileName)
   byte0 <- if (fi$isdir || fi$size < 1) NA_integer_ else readBin(fileName, 
what="integer", size=1, n=1)
   list(Name=basename(fileName), IsDir=fi$isdir, Size=fi$size, FirstByte = 
byte0, ModTime=fi$mtime)
   } 
Then you can get the list of rows that you want converted to a data.frame
with
   rows <- lapply(dir(R.home(), full.names=TRUE), fileExtract)
E.g., I get
  > dput(rows[1:2])
  list(structure(list(Name = "bin", IsDir = TRUE, Size = 0, FirstByte = 
NA_integer_, 
  ModTime = structure(1343316337, class = c("POSIXct", "POSIXt"
  ))), .Names = c("Name", "IsDir", "Size", "FirstByte", "ModTime"
  )), structure(list(Name = "CHANGES", IsDir = FALSE, Size = 28204, 
  FirstByte = 87L, ModTime = structure(1340406834, class = c("POSIXct", 
  "POSIXt"))), .Names = c("Name", "IsDir", "Size", "FirstByte", 
  "ModTime")))
Note that the j'th element of each row has a fixed type.
You want a data.frame with columns named "Name", "IsDir",
"Size", and "FirstByte" where the i'th row contains the data in row[[i]].

If that is what you want then here is a function that does a pretty good job of 
it:
function (listOfRows, nItemsPerRow = unique(vapply(listOfRows, 
length, 0)), col.names = names(rowTemplate), rowTemplate = listOfRows[[1]], 
...) 
{
stopifnot(length(nItemsPerRow) == 1, nItemsPerRow == length(rowTemplate))
if (is.null(col.names)) {
col.names <- sprintf("V%d", seq_len(nItemsPerRow))
}
else {
stopifnot(nItemsPerRow == length(col.names))
}
columns <- lapply(structure(seq_len(nItemsPerRow), names = col.names), 
FUN = function(i) {
v <- vapply(listOfRows, function(Row) Row[[i]], rowTemplate[[i]])
if (is.matrix(v)) { # for when length(rowTemplate[[i]])>1 
v <- t(v)
}
v
})
data.frame(columns, ...)
}
E.g.,
> str(f(rows))
'data.frame':   19 obs. of  5 variables:
 $ Name : Factor w/ 19 levels "bin","CHANGES",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ IsDir: logi  TRUE FALSE FALSE TRUE TRUE TRUE ...
 $ Size : num  0 28204 18351 0 0 ...
 $ FirstByte: int  NA 87 9 NA NA NA NA 101 NA 82 ...
 $ ModTime  : num  1.34e+09 1.34e+09 1.34e+09 1.34e+09 1.34e+09 ...
Note that the POSIXct item, ModTime, got converted to numeric because
vapply didn't handle that class properly.

An advantage of vapply is that it will do some type checking:
> f(list(list(a=1,b=11), list(a=2,b="Twelve")))
Error in vapply(listOfRows, function(Row) Row[[i]], rowTemplate[[i]]) : 
  values must be type 'double',
 but FUN(X[[2]]) result is type 'character'
It will also deal with things like the following, where each row element
contains a few vectors and you want the each vector element in its
own column:
  > str(f(list(list(1:2, 1+1i, letters[1:3]), list(11:12, 11+11i, 
letters[4:6]
  'data.frame':   2 obs. of  6 variables:
   $ V1.1: int  1 11
   $ V1.2: int  2 12
   $ V2  : cplx  1+1i 11+11i
   $ V3.1: Factor w/ 2 levels "a","d": 1 2
   $ V3.2: Factor w/ 2 levels "b","e": 1 2
   $ V3.3: Factor w/ 2 levels "c","f": 1 2

There are other ways to do this, but I don't know if this is the problem
you want to solve.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf
> Of Sam Steingold
> Sent: Friday, August 31, 2012 9:11 AM
> To: r-help@r-project.org; David Winsemius
> Subject: Re: [R] apply --> data.frame
> 
> > * David Winsemius  [2012-08-30 10:14:34 -0700]:
> >
> >> str( as.data.frame( do.call(rbind, strsplit(c("a,1","b,2","c,3"),
> > ",") ) , stringsAsFactors=FALSE) )
> > 'data.frame':   3 obs. of  2 variables:
> >  $ V1: chr  "a" "b" "c"
> >  $ V2: chr  "1" "2" "3"
> 
> do.call/rbind appeared to be TRT. I tried it and got a data frame with
> list columns (instead of vectors);
> 
> as.data.frame(do.call(rbind,lapply(list.files(...), function (name) {
> 
> c(name,list(num1,num2,num3), # num* come from some calculations above
>   
> strsplit(sub("[^-]*(train|test)[^-]*(-(S)?pca([0-9]*))?-s([0-9]*)c([0-9.]*)\\.score",
>"\\1,\\3,\\4,\\5,\\6",name),",")[[1]])
>   })), stringsAsFactors = FALSE)
> 
> 'data.frame': 2 obs. of  8 variables:
>  $ file:List of 2
>   ..$ : chr "zzz_test_0531_0630-Spca181-s0c10.score"
>   ..$ : chr "zzz_train_0531_0630-Spca181-s0c10.score"
>  $ lift.quality:List of 2
>   ..$ : num 0.59
>   ..$ : num 0.621
>  $ proficiency :List of 2
>   ..$ : num 0.0472
>   ..$ : num 0.0472
>  $ set

Re: [R] Class that wraps Data Frame

2012-08-31 Thread Martin Morgan
I guess there are two issues with data.frame. It comes with more than 
you probably want to support (e.g., list and matrix- like subsetter [, 
the user expecting to be able to independently modify any column). And 
it comes with less than you'd like (e.g., support for a 'column' of S4 
objects). By making a class that contains ('is a') data.frame, you 
commit to both limitations.


You're probably using data.frame as a way to implement some basic 
restrictions -- equal-length columns, for instance. But there are 
additional restrictions, too, columns x, y, z must be present and of 
type integer, character, numeric respectively. For this scenario one is 
better off implementing an S4 class (which provides type checking and 
required columns), a validity method (for enforcing the equal-length 
constraint), accessors, and sub-setting following the semantic that 
you'd like to support, e.g., just along the length of the required slots.


The richest place for this in Bioconductor is the IRanges package, 
though it can be a bit daunting from an architecture point of view. A 
couple of things to point to. One is the DataFrame class, which is like 
a data.frame but supporting a broader (in particular S4) set of columns 
and allowing 'metadata' (actually, DataFrame, so recursive) on each 
column. It is relevant if it is important to maintain S4 classes in a 
data.frame-like structure.


Another is the IRanges class, which in some ways fits your overall use 
case. It is basically a rectangular data structure, but with required 
'columns' (the start and width of the range) and then arbitrary columns 
the user can add. It's implemented with slots for start and width, and 
then 'has a' slot containing a DataFrame as 'metadata columns' (the 
actual implementation is more complicated than this). There are start 
and width accessors. Sub-setting is always list-like 
(single-dimensional, along the ranges). Users wanting to access one of 
'their' columns use $ or extract the metadata columns (via mcols()) as a 
DataFrame and then work on that. Maybe it's worth pointing out that the 
basic definitions are column-oriented, an IRanges instance contains 
start and width vectors; there is no 'IRange' class.


The GRanges class (in the GenomicRanges package) 'has a' IRanges, but 
adds additional required slots ('seqnames' to reference the names of the 
chromosome sequences to which the ranges refer, 'strand' to indicate the 
strand to which the range belongs, etc.). So the pattern here avoids the 
'is a' relationship that simple class extension would imply.


The IRanges package is at

  http://bioconductor.org/packages/devel/bioc/html/IRanges.html

I've described the 'devel' version of Bioconductor

  http://bioconductor.org/developers/useDevel/

Martin


On 08/31/2012 08:39 AM, Bert Gunter wrote:

To add to what David said ...

Of course, there are already S3 "getters" and "setters" methods for data
frames ("[.data.frame" and "[<-.data.frame" )*. These could clearly be
extended -- i.e. the data.frame class could be extended and appropriate S3
methods written. Whether you use S3 or S4 depends on the degree of control,
type checking, reuse etc. you want/need. David's suggestion to look at
Bioconductor is a good one.

Cheers,
Bert
*If you are unfamiliar with the S3 extract methods, consult the R Language
Definition Manual.

On Fri, Aug 31, 2012 at 8:14 AM, David Winsemius wrote:



On Aug 31, 2012, at 5:57 AM, Ramiro Barrantes wrote:


Hello,

I have again a "good practices"/programming theory question regarding

data.frames.


One of the fundamental objects that I use is the data frame with a

particular set of columns that I would fill or get information from, and an
entire system would revolve around getting information from or putting
information to such data.frame.


On a different OOP programming language I would be tempted to create a

class that would "wrap-around" that data.frame and create "getters" and
"setters" methods that would return whatever information I need. I started
doing that using S4.


Does anyone have examples of packages that use that approach or any

suggestions?  It just seems to me that a class/object would be a better
idea because it would create a single, hopefully well validated way to
access information and edit the fundamental data.frame object, which would
be helpful if there are several programmers on the team and/or if some of
the data.frame manipulations are not straightforward and are best left
encapsulated in a method of a class, and then have people use that method.
  I would just like to know if there are reasons not do it that way and if
there are any examples of packages that use that approach and that I can
learn from.

You could argue that the entire BioConductor project represents such an
effort. It makes extensive use of S4 methods. I'm not a user so cannot
readily point to examples of S4 functions that have set. and get. methods
for particular sorts of dataframes, but I suspect you

Re: [R] test Breslow-Day for svytable??

2012-08-31 Thread David Winsemius

On Aug 31, 2012, at 7:20 AM, Diana Marcela Martinez Ruiz wrote:

> Hi all,
> 
> I want to know how to perform the test Breslow-Day test for homogeneity of 
> odds ratios (OR) stratified for svytable. This test is obtained with the 
> following code:
> 
> epi.2by2 (dat = daty, method = "case.control" conf.level = 0.95,

missing comma here ...^

>units = 100, homogeneity = "breslow.day", verbose = TRUE)
> 
> where "daty" is the object type table  svytable consider it, but when I run 
> the code
> does not throw the homogeneity test.

You are asked in the Posting guide to copy all errors and warnings when asking 
about unexpected behavior. When I run epi.2y2 on the output of a syvtable 
object I get no errors, but I do get warnings which I think are due to 
non-integer entries in the weighted table. I also get from a svytable() 
usingits first example on the help page an object that is NOT a set of 2 x 2 
tables in an array of the structure as expected by epi.2by2(). The fact that 
epi.2by2() will report numbers with labels for a 2 x 3 table means that its 
error checking is weak.

This is the output of str(dat) from one of the example on epi.2by2's help page:

> str(dat)
 table [1:2, 1:2, 1:3] 41 13 6 53 66 37 25 83 23 37 ...
 - attr(*, "dimnames")=List of 3
  ..$ Exposure: chr [1:2] "+" "-"
  ..$ Disease : chr [1:2] "+" "-"
  ..$ Strata  : chr [1:3] "20-29 yrs" "30-39 yrs" "40+ yrs"

Notice that is is a 2 x 2 x n array. (Caveat:: from here on out I am simply 
reading the help pages and using str() to look at the objects created to get an 
idea regarding success or failure. I am not an experienced user of either 
package.)  I doubt that  what you got from svytable is a 2 x 2 table. As 
another example you can build a 2 x 2 x n table from the built-in dataset: 
UCBAdmissions 

DF <- as.data.frame(UCBAdmissions)
## Now 'DF' is a data frame with a grid of the factors and the counts
## in variable 'Freq'.
dat2 <- xtabs(Freq ~ Gender + Admit+Dept, DF)
epiR::epi.2by2(dat = dat2, method = "case.control", conf.level = 0.95, 
 units = 100, homogeneity = "breslow.day", verbose = TRUE)$OR.homog
#-
  test.statistic dfp.value
1   18.82551  5 0.00207139

Using svydesign and svytable I _think_ this is how one would go about 
constructing a 2 x 2 table:

tbl2<-svydesign(  ~ Gender + Admit+Dept, weights=~Freq, data=DF)
  summary(dclus1)
(tbl2by2 <- svytable(~ Gender + Admit+Dept, tbl2))
 epiR::epi.2by2(dat = tbl, method = "case.control", conf.level = 0.95, 
 units = 100, homogeneity = "breslow.day", verbose = TRUE)$OR.homog
#---
  test.statistic dfp.value
1   18.82551  5 0.00207139

(At least I got internal consistency. I see you copied Thomas Lumley, which is 
a good idea. I'll be happy to get corrected on any point. I'm adding the 
maintainer of epiR to the recipients.)

-- 
David Winsemius, MD
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Histogram to KDE

2012-08-31 Thread David L Carlson
Using a data.frame x with columns bins and counts:

x <- structure(list(bins = c(3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5, 10.5, 
11.5, 12.5, 13.5, 14.5, 15.5), counts = c(1, 1, 2, 3, 6, 18, 
19, 23, 8, 10, 6, 2, 1)), .Names = c("bins", "counts"), row.names =
4:16,
class = "data.frame")

This will give you a plot of the kde estimate:

xkde <- density(rep(bins, counts), bw="SJ")
plot(xkde)

As for the standard error or the confidence interval, you would probably
need to use bootstrapping. 

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of firdaus.janoos
> Sent: Friday, August 31, 2012 9:52 AM
> To: r-help@r-project.org
> Subject: [R] Histogram to KDE
> 
> Hello,
> I wanted to know if there was way to convert a histogram of a data-set
> to a
> kernel density estimate directly in R ?
> 
> Specifically, I have a histogram [bins, counts] of samples {X1 ...
> XN} of a quantized variable X where there is one bin for each level of
> X,
> and I'ld like to directly get a kde estimate of the pdf of X from the
> histogram. Therefore, there is no additional quantization of X in the
> histogram. Most KDE methods in R seem to require the original sample
> set   - and I would like to avoid re-creating the samples from the
> histogram. Is there some quick way of doing this using one of the
> standard
> kde methods in R ?
> 
> Also, a general statistical question - is there some measure of the
> standard error or confidence interval or similar of a KDE of a data-set
> ?
> 
> Thanks,
> -fj
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] latex \subfloat{} incompatible with sweave/knitr code

2012-08-31 Thread Yihui Xie
OK, then this turns out to be a pure LaTeX problem. Thanks for the
experiments and examples, and they worked very well for me. It seems
subcaption is on the radar of LyX 2.1 now.

Regards,
Yihui
--
Yihui Xie 
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA


On Fri, Aug 31, 2012 at 9:40 AM, Liviu Andronic  wrote:
> On Thu, Aug 30, 2012 at 12:50 AM, Yihui Xie  wrote:
>> Do you know what environments are allowed inside \subfloat{}? The
>>
> No, not really. From what I can tell, \subfloat{} is provided by the
> `subfig' package. Here's what their docs have to say about
> compatibility with verbatim and fancyvrb packages:
> "You cannot place a verbatim environment inside of the \subfloat command 
> because
> the verbatim environment needs to change the character classes before
> the text in the environment is read by TEX. Therefore, if you really
> want to include verbatim text inside a sub-float, you will need to
> place the verbatim text into a box and then feed the box to the
> \subfloat command."
>
> And this is what CTAN says about the 'subfig' package:
> "The package 'subfigure' is now considered obsolete: it was superseded
> by 'subfig', but users may find the more recent 'subcaption' package
> more satisfactory."
>
>
>> graphics example works because it is nothing but a simple
>> \includegraphics{} command. The table example you gave is much more
>> complicated than that.
>>
> I played around with the 'subcaption' package and, although it has
> some side effects  on the figures, it accepts verbatim in its
> 'subtable' environment. See knitr LyX file and PDF:
> http://s000.tinyupload.com/index.php?file_id=00455858466446731442
> http://s000.tinyupload.com/index.php?file_id=07968719976440864387
>
> Maybe it's high time to ask the LyX devels to implement support for
> the 'subcaption' package.
>
> Liviu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] apply --> data.frame

2012-08-31 Thread Sam Steingold
> * David Winsemius  [2012-08-30 10:14:34 -0700]:
>
>> str( as.data.frame( do.call(rbind, strsplit(c("a,1","b,2","c,3"),  
> ",") ) , stringsAsFactors=FALSE) )
> 'data.frame': 3 obs. of  2 variables:
>  $ V1: chr  "a" "b" "c"
>  $ V2: chr  "1" "2" "3"

do.call/rbind appeared to be TRT. I tried it and got a data frame with
list columns (instead of vectors);

as.data.frame(do.call(rbind,lapply(list.files(...), function (name) {

c(name,list(num1,num2,num3), # num* come from some calculations above
  
strsplit(sub("[^-]*(train|test)[^-]*(-(S)?pca([0-9]*))?-s([0-9]*)c([0-9.]*)\\.score",
   "\\1,\\3,\\4,\\5,\\6",name),",")[[1]])
  })), stringsAsFactors = FALSE)

'data.frame':   2 obs. of  8 variables:
 $ file:List of 2
  ..$ : chr "zzz_test_0531_0630-Spca181-s0c10.score"
  ..$ : chr "zzz_train_0531_0630-Spca181-s0c10.score"
 $ lift.quality:List of 2
  ..$ : num 0.59
  ..$ : num 0.621
 $ proficiency :List of 2
  ..$ : num 0.0472
  ..$ : num 0.0472
 $ set :List of 2
  ..$ : chr "test"
  ..$ : chr "train"
 $ scale   :List of 2
  ..$ : chr "S"
  ..$ : chr "S"
 $ pca :List of 2
  ..$ : chr "181"
  ..$ : chr "181"
 $ s   :List of 2
  ..$ : chr "0"
  ..$ : chr "0"
 $ c   :List of 2
  ..$ : chr "10"
  ..$ : chr "10"

I guess the easiest way is to replace c(...list()...) with c(...) but
that would mean converting num1,num2,num3 to string and back which I
want to avoid for aesthetic reasons. Any better suggestions?

thanks a lot!

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://jihadwatch.org http://thereligionofpeace.com
http://palestinefacts.org http://ffii.org http://pmw.org.il
I don't have an attitude problem. You have a perception problem.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Class that wraps Data Frame

2012-08-31 Thread Bert Gunter
To add to what David said ...

Of course, there are already S3 "getters" and "setters" methods for data
frames ("[.data.frame" and "[<-.data.frame" )*. These could clearly be
extended -- i.e. the data.frame class could be extended and appropriate S3
methods written. Whether you use S3 or S4 depends on the degree of control,
type checking, reuse etc. you want/need. David's suggestion to look at
Bioconductor is a good one.

Cheers,
Bert
*If you are unfamiliar with the S3 extract methods, consult the R Language
Definition Manual.

On Fri, Aug 31, 2012 at 8:14 AM, David Winsemius wrote:

>
> On Aug 31, 2012, at 5:57 AM, Ramiro Barrantes wrote:
>
> > Hello,
> >
> > I have again a "good practices"/programming theory question regarding
> data.frames.
> >
> > One of the fundamental objects that I use is the data frame with a
> particular set of columns that I would fill or get information from, and an
> entire system would revolve around getting information from or putting
> information to such data.frame.
> >
> > On a different OOP programming language I would be tempted to create a
> class that would "wrap-around" that data.frame and create "getters" and
> "setters" methods that would return whatever information I need. I started
> doing that using S4.
> >
> > Does anyone have examples of packages that use that approach or any
> suggestions?  It just seems to me that a class/object would be a better
> idea because it would create a single, hopefully well validated way to
> access information and edit the fundamental data.frame object, which would
> be helpful if there are several programmers on the team and/or if some of
> the data.frame manipulations are not straightforward and are best left
> encapsulated in a method of a class, and then have people use that method.
>  I would just like to know if there are reasons not do it that way and if
> there are any examples of packages that use that approach and that I can
> learn from.
>
> You could argue that the entire BioConductor project represents such an
> effort. It makes extensive use of S4 methods. I'm not a user so cannot
> readily point to examples of S4 functions that have set. and get. methods
> for particular sorts of dataframes, but I suspect you can pose the same
> question on the BioC mailing list and get a more informed answer.
>
> --
> David Winsemius, MD
> Alameda, CA, USA
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem installing Rmpi with Open MPI

2012-08-31 Thread Dirk Eddelbuettel

On 31 August 2012 at 07:14, Linh Tran wrote:
| I took some time to think about it and don't know that it was the case 
| for me. If the openmpi-bin package was missing, I wouldn't have been 
| able to even use openmpi and during my trouble, I would check the mpi 
| software each time I installed it to verify that it worked (using some 
| simple example commands). Furthermore, I checked the parent folder after 
| each installation and saw non-empty bin folders in them each time. Did 
| the error code somehow point to that? It could have been that the 
| open-mpi bin package somehow installed itself during my troubleshooting.

Hao and I both arrived the conclusion that _loading_ Rmpi fails if
openmpi-bin is not installed. I attributed that to orterun, it is actually
orted which is used.  Which makes complete sense.

So to recap out experience: yes, one may manage to compile and link Rmpi, but
it will not run with orted.
 
| I also don't know whether it was the "enable-shared" option, the 
| "disable-dlopen" option, or both that helped me finally get Rmpi to 
| work. I could go back and uninstall everything to start over and 
| pinpoint it, though I don't think I can go through the frustration 
| again. Regardless, I hope that this information helps other users.

For Hao and myself, Rmpi works with stock Open MPI packages from Debian /
Ubuntu, which appears to contradict Brian Ripley's assertion that
disable-dlopen is a must. 
 
| Thank you very much Dirk and Brian for your help. And thank you Hao for 
| such a great job with the Rmpi package. As always, guys like you are 
| what make R so successful.

Seconded -- the Open MPI folks provide a great tool and it has been a
pleasure to support Rmpi in Debian/Ubuntu as Hao is a very responsive
upstream author who has written a tremendously useful package.

Dirk


| 
| 
| On 08/30/2012 11:02 AM, Dirk Eddelbuettel wrote:
| > On 29 August 2012 at 20:07, Linh Tran wrote:
| > | Thank you for the advice. I tried using the command, and it still
| > | wouldn't load.
| > |
| > | I tried uninstalling all of the MPI interfaces, reinstalled openmpi
| > | using the "--enabled-shared --disable-dlopen" command, and Rmpi was able
| > | to install sucessfully inside R.
| >
| > I just spent some time going over this, along with Prof Hao, the author of
| > Rmpi.  You may simply have been missing the openmpi-bin package which
| > provides tools like orterun -- Open MPI now needs this at startup.  This may
| > get added to configure tests in the future to alert users.
| >
| > With this openmpi-bin package installed, I could easily install Rmpi on
| > Debian and Ubuntu using the distro's Open MPI packages. No need for local
| > Open MPI builds.  Also note this post on the Open MPI list:
| >
| > http://www.open-mpi.org/community/lists/devel/2012/04/10840.php
| >
| > Static builds may lead to loss of resources at runtime as you may have
| > multiple copies running.
| >   
| > | Thank you to Prof Ripley for pointing that out! I think it was shared
| > | and dlopen options that kept preventing me from succeeding.
| >
| > I believe this advise to be incorrect. We have been using dynamically linked
| > builds of Open MPI for several years on Debian with good success. I was
| > involved in an overhaul of Debian's Open MPI packages around 2006/2007; the
| > package has since been very well maintained by others.
| >
| > Dirk
| >   
| > | On 08/29/2012 01:40 PM, Dirk Eddelbuettel wrote:
| > | > On 29 August 2012 at 11:37, Linh Tran wrote:
| > | > | I've spent a few days trying to install Rmpi with no luck. I 
originally
| > | > | tried using mpich, moved on to mpich2, and then to openmpi. I've 
gotten
| > | > | the furthest with openmpi, though am still running into this problem 
and
| > | > | can't figure it out. Can someone help!? Thanks so much in advanced.
| > | > |
| > | > | I'm using an HP Envy laptop with Ubuntu 12.04. Output is below, with 
the
| > | > | error at the bottom.
| > | >
| > | > I suggest you do
| > | >
| > | > sudo apt-get install r-cran-rmpi
| > | >
| > | > which installs the prebuild Rmpi as well as the Open MPI libary it 
depends upon.
| > | >
| > | > Dirk
| > | >
| > |
| >
| 

-- 
Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Class that wraps Data Frame

2012-08-31 Thread David Winsemius

On Aug 31, 2012, at 5:57 AM, Ramiro Barrantes wrote:

> Hello,
> 
> I have again a "good practices"/programming theory question regarding 
> data.frames.
> 
> One of the fundamental objects that I use is the data frame with a particular 
> set of columns that I would fill or get information from, and an entire 
> system would revolve around getting information from or putting information 
> to such data.frame.
> 
> On a different OOP programming language I would be tempted to create a class 
> that would "wrap-around" that data.frame and create "getters" and "setters" 
> methods that would return whatever information I need. I started doing that 
> using S4.
> 
> Does anyone have examples of packages that use that approach or any 
> suggestions?  It just seems to me that a class/object would be a better idea 
> because it would create a single, hopefully well validated way to access 
> information and edit the fundamental data.frame object, which would be 
> helpful if there are several programmers on the team and/or if some of the 
> data.frame manipulations are not straightforward and are best left 
> encapsulated in a method of a class, and then have people use that method.  I 
> would just like to know if there are reasons not do it that way and if there 
> are any examples of packages that use that approach and that I can learn from.

You could argue that the entire BioConductor project represents such an effort. 
It makes extensive use of S4 methods. I'm not a user so cannot readily point to 
examples of S4 functions that have set. and get. methods for particular sorts 
of dataframes, but I suspect you can pose the same question on the BioC mailing 
list and get a more informed answer.

-- 
David Winsemius, MD
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] differences between survival models between STATA and R

2012-08-31 Thread Göran Broström
On Mon, Jul 9, 2012 at 3:59 PM, Terry Therneau  wrote:

> Without more information, we can only guess what you did, or what you are
> seeing on the page that is "different".
>
> I'll make a random guess though.  There are about 5 ways to paramaterize
> the Weibull distribution.  The standard packages that I know, however, tend
> to use the one found in the Kalbfleisch and Prentice book The Statistical
> Analysis of Failure time Data.  This includes the survreg funciton in R and
> lifereg in SAS, and likely stata tthought I don't know that package.  The
> aftreg function in the eha package uses something different.
>

The main difference is that aftreg estimates AFT factors, i.e., the
(exponential) score of an individual accelerates time with that factor. In
survreg, expected life is multiplied by the corresponding factor (the
inverse of the previous).

>
> About 1/2 the weibull questions I see are due to a change in parameters.
>
> For this reason I have introduced an additional parameter to aftreg,
'param', in the latest version of eha. The default value is 'default'
(everything is as before) and an alternative is 'survreg' (everything is as
in survreg).

Göran

> Terry T.
>
>  begin included message -
>
>
>
>
> Dear Community,
>
> I have been using two types of survival programs to analyse a data set.
>
> The first one is an R function called aftreg. The second one an STATA
> function called streg.
>
> Both of them include the same analyisis with a weibull distribution. Yet,
> results are very different.
>
> Shouldn't the results be the same?
>
> Kind regards,
> J
>
> __**
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/**listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/**
> posting-guide.html 
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Göran Broström

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help on numerical object and ifelse function

2012-08-31 Thread Rui Barradas
Hello,

w2 <- if(z > 14) x else y
w2

The difference is that ifelse is vectorized and returns an object of the 
same length as the condition.
Since length(z > 14) == 1, it only returns x[1] (or y[1], were the 
condition FALSE).

Hope this helps,

Rui Barradas
Em 31-08-2012 12:55, Andras Farkas escreveu:
> Dear All,
>   
> this is probably an easy one but I can not get a handle on it:
>   
> x <-c(1,2,3,4,5)
> y <-c(6,7,8,9,10)
> z <-15
> w <-ifelse(z>14,x,y)
>   
> this will give me a value of 1 for w. What I would like to get is the whole 
> string of x, so that w would become a numeric object of 5 characters exactly 
> the same as x.
>   
> Apreciate the help,
>   
> Sincerely,
>   
> Andras
>   [[alternative HTML version deleted]]
>
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] loading 'RcmdrPlugin.FactoMineR' pachage in R

2012-08-31 Thread Mahbubeh Parsaeian
Hi everybody
I have a question about loading 'RcmdrPlugin.FactoMineR'
pachage in R.
I have installed Rcmdr, FactoMineR and 'RcmdrPlugin.FactoMineR.
when I want to load this package, at first a message inform me I should load
the following list of packages:
FactoMineR, ellipse, lattice, cluster, scatterplot3d, Rcmdr,
tcltk, car, MASS, nnet
I have loaded all above packages and again I load the  'RcmdrPlugin.FactoMineR' 
package.
The output is as follows:
#
local({pkg <- select.list(sort(.packages(all.available =
TRUE)),graphics=TRUE)
+ if(nchar(pkg)) library(pkg, character.only=TRUE)})
Attaching package: 'RcmdrPlugin.FactoMineR'
The following object(s) are masked _by_ '.GlobalEnv':
    defmacro,
FactoAFDM, FactoCA, Factocatdes, FactoDMFA, FactoGPA,
    FactoHCPC,
FactoHMFA, FactoMCA, FactoMFA, FactoPCA, Factoprefpls,
    readDataSetFacto
The following object(s) are masked from 'package:Rcdr':
    defmacro
#
Would you please help me to find a good solution for this problem?
Thanks a lot.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] dynamic list in aggregate()

2012-08-31 Thread arun
HI,

You can also use these in addition to aggregate:
tapply(dat$x,INDEX=dat[grp],mean)
library(plyr)
ddply(dat,.(dat[grp]$a,dat[grp]$d),summarize,mean(x))
#  dat[grp]$a dat[grp]$d ..1
#1  b  c  0.26781378
#2  b  d -0.01994023
#3  c  e  0.14874737
#4  d  d -0.21051512
#5  e  b  0.18821072

A.K.



- Original Message -
From: Xianming Wei 
To: "r-help@r-project.org" 
Cc: 
Sent: Thursday, August 30, 2012 10:37 PM
Subject: [R] dynamic list in aggregate()

Hi all,

How can I have a dynamic list for different combinations of grouping factors in 
the following example? Thanks.

dat <- data.frame(x=rnorm(100),
                  a=sample(letters[1:5], replace = T),
                  b=sample(letters[1:5], replace = T),
                  c=sample(letters[1:5], replace = T),
                  d=sample(letters[1:5], replace = T))

## define grouping factors
grp <- c('a','d') # or any combination among a, b, c and d
## something like this for the list
lst <- paste('dat$', grp, sep='', collapse =',')
## aggregate on the defined list
aggregate(dat$x, list(lst), mean)

Thanks.

Regards,
Xianming


 Internet e-Mail Disclaimer  

PRIVILEGED - PRIVATE AND CONFIDENTIAL: This email and any files transmitted 
with it are intended solely for the use of the addressee(s) and may contain 
information, which is confidential or privileged. If you are not the intended 
recipient, be aware that any disclosure, copying, distribution, or use of the 
contents of this information is prohibited. In such case, you should destroy 
this message and kindly notify the sender by reply e-mail. The views and 
opinions expressed in this e-mail are those of the sender and do not 
necessarily reflect the views of the company. 

VIRUSES:  Email transmission cannot be guaranteed to be secure or error free, 
as information may be intercepted, corrupted, lost, destroyed, arrive late or 
incomplete or contain viruses. This email and any files attached to it have 
been checked with virus detection software before transmission. You should 
nonetheless carry out your own virus check before opening any attachment. BSES 
Limited does not represent or warrant that files attached to this email are 
free from computer viruses or other defects and accepts no liability for any 
loss or damage that may be caused by software viruses

    [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help on numerical object and ifelse function

2012-08-31 Thread Andras Farkas
Dear All,
 
this is probably an easy one but I can not get a handle on it:
 
x <-c(1,2,3,4,5)
y <-c(6,7,8,9,10)
z <-15
w <-ifelse(z>14,x,y)
 
this will give me a value of 1 for w. What I would like to get is the whole 
string of x, so that w would become a numeric object of 5 characters exactly 
the same as x.
 
Apreciate the help,
 
Sincerely,
 
Andras
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] loading 'RcmdrPlugin.FactoMineR' pachage in R

2012-08-31 Thread Mahbubeh Parsaeian




- Forwarded Message -
From: Mahbubeh Parsaeian 
To: "r-help@r-project.org" 
Cc: 
Sent: Friday, 31 August 2012, 17:18
Subject: loading 'RcmdrPlugin.FactoMineR' pachage in R

Hi everybody
I have a question about loading 'RcmdrPlugin.FactoMineR'
pachage in R.
I have installed Rcmdr, FactoMineR and 'RcmdrPlugin.FactoMineR.
when I want to load this package, at first a message inform me I should load
the following list of packages:
FactoMineR, ellipse, lattice, cluster, scatterplot3d, Rcmdr,
tcltk, car, MASS, nnet
I have loaded all above packages and again I load the  'RcmdrPlugin.FactoMineR' 
package.
The output is as follows:

local({pkg <- select.list(sort(.packages(all.available =
TRUE)),graphics=TRUE)
+ if(nchar(pkg)) library(pkg, character.only=TRUE)})
Attaching package: 'RcmdrPlugin.FactoMineR'
The following object(s) are masked _by_ '.GlobalEnv':
    defmacro,
FactoAFDM, FactoCA, Factocatdes, FactoDMFA, FactoGPA,
    FactoHCPC,
FactoHMFA, FactoMCA, FactoMFA, FactoPCA, Factoprefpls,
    readDataSetFacto
The following object(s) are masked from 'package:Rcdr':
    defmacro



Would you please help me to find a good solution for this problem?
Thanks a lot.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Histogram to KDE

2012-08-31 Thread firdaus.janoos
Hello,
I wanted to know if there was way to convert a histogram of a data-set to a
kernel density estimate directly in R ?

Specifically, I have a histogram [bins, counts] of samples {X1 ...
XN} of a quantized variable X where there is one bin for each level of X,
and I'ld like to directly get a kde estimate of the pdf of X from the
histogram. Therefore, there is no additional quantization of X in the
histogram. Most KDE methods in R seem to require the original sample
set   - and I would like to avoid re-creating the samples from the
histogram. Is there some quick way of doing this using one of the standard
kde methods in R ?

Also, a general statistical question - is there some measure of the
standard error or confidence interval or similar of a KDE of a data-set ?

Thanks,
-fj

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] test Breslow-Day for svytable??

2012-08-31 Thread John Sorkin
Suggstion:
You need to send us more information, i.e. the code that genrated daty, or a 
listing of the daty structure, and a copy of the listing
produced by epi.2by2
John

 
John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)>>> Diana 
Marcela Martinez Ruiz  8/31/2012 10:20 AM >>>

Hi all,

I want to know how to perform the test Breslow-Day test for homogeneity of 
odds ratios (OR) stratified for svytable. This test is obtained with the 
following code:

epi.2by2 (dat = daty, method = "case.control" conf.level = 0.95,
units = 100, homogeneity = "breslow.day", verbose = TRUE)

where "daty" is the object type table  svytable consider it, but when I run the 
code
does not throw the homogeneity test.

Thanks.  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Confidentiality Statement:
This email message, including any attachments, is for the sole use of the 
intended recipient(s) and may contain confidential and privileged information.  
Any unauthorized use, disclosure or distribution is prohibited.  If you are not 
the intended recipient, please contact the sender by reply email and destroy 
all copies of the original message. 
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] latex \subfloat{} incompatible with sweave/knitr code

2012-08-31 Thread Liviu Andronic
On Thu, Aug 30, 2012 at 12:50 AM, Yihui Xie  wrote:
> Do you know what environments are allowed inside \subfloat{}? The
>
No, not really. From what I can tell, \subfloat{} is provided by the
`subfig' package. Here's what their docs have to say about
compatibility with verbatim and fancyvrb packages:
"You cannot place a verbatim environment inside of the \subfloat command because
the verbatim environment needs to change the character classes before
the text in the environment is read by TEX. Therefore, if you really
want to include verbatim text inside a sub-float, you will need to
place the verbatim text into a box and then feed the box to the
\subfloat command."

And this is what CTAN says about the 'subfig' package:
"The package 'subfigure' is now considered obsolete: it was superseded
by 'subfig', but users may find the more recent 'subcaption' package
more satisfactory."


> graphics example works because it is nothing but a simple
> \includegraphics{} command. The table example you gave is much more
> complicated than that.
>
I played around with the 'subcaption' package and, although it has
some side effects  on the figures, it accepts verbatim in its
'subtable' environment. See knitr LyX file and PDF:
http://s000.tinyupload.com/index.php?file_id=00455858466446731442
http://s000.tinyupload.com/index.php?file_id=07968719976440864387

Maybe it's high time to ask the LyX devels to implement support for
the 'subcaption' package.

Liviu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] test Breslow-Day for svytable??

2012-08-31 Thread Diana Marcela Martinez Ruiz

Hi all,

 I want to know how to perform the test Breslow-Day test for homogeneity of 
odds ratios (OR) stratified for svytable. This test is obtained with the 
following code:

 epi.2by2 (dat = daty, method = "case.control" conf.level = 0.95,
units = 100, homogeneity = "breslow.day", verbose = TRUE)

 where "daty" is the object type table  svytable consider it, but when I run 
the code
does not throw the homogeneity test.

 Thanks.  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem installing Rmpi with Open MPI

2012-08-31 Thread Linh Tran
I took some time to think about it and don't know that it was the case 
for me. If the openmpi-bin package was missing, I wouldn't have been 
able to even use openmpi and during my trouble, I would check the mpi 
software each time I installed it to verify that it worked (using some 
simple example commands). Furthermore, I checked the parent folder after 
each installation and saw non-empty bin folders in them each time. Did 
the error code somehow point to that? It could have been that the 
open-mpi bin package somehow installed itself during my troubleshooting.


I also don't know whether it was the "enable-shared" option, the 
"disable-dlopen" option, or both that helped me finally get Rmpi to 
work. I could go back and uninstall everything to start over and 
pinpoint it, though I don't think I can go through the frustration 
again. Regardless, I hope that this information helps other users.


Thank you very much Dirk and Brian for your help. And thank you Hao for 
such a great job with the Rmpi package. As always, guys like you are 
what make R so successful.



On 08/30/2012 11:02 AM, Dirk Eddelbuettel wrote:

On 29 August 2012 at 20:07, Linh Tran wrote:
| Thank you for the advice. I tried using the command, and it still
| wouldn't load.
|
| I tried uninstalling all of the MPI interfaces, reinstalled openmpi
| using the "--enabled-shared --disable-dlopen" command, and Rmpi was able
| to install sucessfully inside R.

I just spent some time going over this, along with Prof Hao, the author of
Rmpi.  You may simply have been missing the openmpi-bin package which
provides tools like orterun -- Open MPI now needs this at startup.  This may
get added to configure tests in the future to alert users.

With this openmpi-bin package installed, I could easily install Rmpi on
Debian and Ubuntu using the distro's Open MPI packages. No need for local
Open MPI builds.  Also note this post on the Open MPI list:

http://www.open-mpi.org/community/lists/devel/2012/04/10840.php

Static builds may lead to loss of resources at runtime as you may have
multiple copies running.
  
| Thank you to Prof Ripley for pointing that out! I think it was shared

| and dlopen options that kept preventing me from succeeding.

I believe this advise to be incorrect. We have been using dynamically linked
builds of Open MPI for several years on Debian with good success. I was
involved in an overhaul of Debian's Open MPI packages around 2006/2007; the
package has since been very well maintained by others.

Dirk
  
| On 08/29/2012 01:40 PM, Dirk Eddelbuettel wrote:

| > On 29 August 2012 at 11:37, Linh Tran wrote:
| > | I've spent a few days trying to install Rmpi with no luck. I originally
| > | tried using mpich, moved on to mpich2, and then to openmpi. I've gotten
| > | the furthest with openmpi, though am still running into this problem and
| > | can't figure it out. Can someone help!? Thanks so much in advanced.
| > |
| > | I'm using an HP Envy laptop with Ubuntu 12.04. Output is below, with the
| > | error at the bottom.
| >
| > I suggest you do
| >
| > sudo apt-get install r-cran-rmpi
| >
| > which installs the prebuild Rmpi as well as the Open MPI libary it depends 
upon.
| >
| > Dirk
| >
|



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to modify the values of the parameters passing via ...

2012-08-31 Thread Bert Gunter
The OP appears to be confused about argument lists and argument passing
(which has nothing to do with "imputed variables" btw). Here is an
explanation providing a more explicit explanation than RUI's post.

The formal arguments of fun1 are "x", "y",and "z".
The formal arguments of fun2 are "aa" and "..."

So, first of all, the call

fun2(x=1, y=4, z=8)

will fail at the first statement fun1(x=aa,y=5) ##lazy evaluation
because aa is missing and therefore x is missing.  So the OP has posted
incorrectly -- all the claims made therein are false.

Assuming, as RUI did, that the fun2 call was therefore really
fun2(aa=1, y=4, z=8)

then the call will fail as claimed by the OP. This is why:
The formal argument list of fun2 has only aa and  So, as Rui's answer
implied (but did not explain fully), the call
fun2(aa=1, y=4, z=8)
would cause the y and z arguments to become part of the ... list.
Hence the call
 d=fun1(x=aa, y=5, ...)
has the y argument occurring twice (as y=4 and y=5), resulting in the error
message seen.

Given this confusion and also Rui's comment on ";", I would strongly
recommend that the OP spend some time with the R Language Definition manual
(presumably, he/she has already read the Intro to R manual, which always
should be first, even _before_ posting).

Cheers,
Bert


On Thu, Aug 30, 2012 at 3:58 PM, Rui Barradas  wrote:
> Hello,
>
> You could see what is fun1 receiving with this modification meant for
> debugging:
>
> fun1 <-function(x, y, z=10){
> print(match.call())
> x + y + z
> }
>
> Anyway, there is s standard way of dealing with argument '...' when one
> needs to know what was passed.
> Include dots <- list(...) at the beginning of fun2 and then use it when
> necessary:
>
> fun1 <-function(x, y, z=10){x + y + z}
>
> fun2 <-function(aa, ...) {
>   dots <- list(...)
>   a <- fun1(x=aa, y=5)# works well  a=1+5+10=16
>   b <- fun1(x=aa, ...)# works well  b=1+4+8=13
>   y <- 0  # it will not affect values passed using ...
>   c <- fun1(x=aa, ...)# works well but c = 1+4+8=13
>   d <- fun1(x=aa, y=5, dots$z)  # works well  d=1+5+8=14
>   c(a, b, c, d)
> }
>
> fun2(aa=1, y=4, z=8)
> [1] 16 13 13 14
>
>
> Also, to end statements with ';' is a C language tique, R doesn't need it.
> (Nor it hurts.)
>
> Hope this helps,
>
> Rui Barradas
>
> Em 30-08-2012 23:07, Zhiqiu Hu escreveu:
>>
>> Dear Friends,
>>
>> Let's assume there are three parameters that were passed into fun1. In
>> fun1, we need to modify one para but the remains need to be untouched.
And
>> then all parameters were passed into fun2. However, I have failed to
>> achieve it.
>>
>> Please see the following code.
>>
>> ##
>> fun1 <-function(x, y, z=10) {x+y+z;}
>>
>> fun2 <-function(aa, ...) {
>>a=fun1(x=aa, y=5);# works well  a=1+5+10=16
>>b=fun1(x=aa, ...);   # works well  b=1+4+8=13
>>y=0  # it will not affect values passed using
>> ...
>>c=fun1(x=aa, ...)# works well but c = 1+4+8=13
>>
>> #   Some time we may only want to use parts of the inputed variables
>> #   we only want to modify y
>>d=fun1(x=aa, y=5, ...);  #error : formal argument "y" matched by
>> multiple
>> actual arguments
>> #How can I get the result   d=1+5+8=14?
>>
>>c(a, b, c)
>> }
>>
>> fun2(x=1, y=4, z=8)
>> ##
>>
>> I will appreciate it if you would give me any suggestions.
>>
>> Best wishes,
>>
>> Zhiqiu
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] reviews for quality control

2012-08-31 Thread Bert Gunter
Completely OT for this list. Post elsewhere.

-- Bert

On Thu, Aug 30, 2012 at 11:13 PM, carol white  wrote:

> Hi,
> It might be a trivial question but I just wonder if you could advise good
> theoretical tutorials, reviews on NGS (different platforms) quality control
>  like nucleotides quality by cycle, nucleotides frequency by cycle, GC
> content and distribution, K-mer frequency by cycle as well as quality
> control after alignment and mapping.
>
>
> Look forward to your reply,
>
> Carol
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] fitting lognormal censored data

2012-08-31 Thread Rich Shepard

On Thu, 30 Aug 2012, Salma Wafi wrote:


I am trying to get some estimator based on lognormal distribution when we
have left,interval, and right censored data.


  Take a look at the NADA package. For left-censored data you can use cenmle
with dist = 'lognormal'. Survival analyses work well for right-censored data
as well as interval-censored data.

  Dennis Helsel's second edition, "Statistics for Censored Environmental Data
Unsing Minitab and R" is the basis for NADA and a valuable resource.

Rich

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Class that wraps Data Frame

2012-08-31 Thread Ramiro Barrantes
Hello,

I have again a "good practices"/programming theory question regarding 
data.frames.

One of the fundamental objects that I use is the data frame with a particular 
set of columns that I would fill or get information from, and an entire system 
would revolve around getting information from or putting information to such 
data.frame.

On a different OOP programming language I would be tempted to create a class 
that would "wrap-around" that data.frame and create "getters" and "setters" 
methods that would return whatever information I need. I started doing that 
using S4.

Does anyone have examples of packages that use that approach or any 
suggestions?  It just seems to me that a class/object would be a better idea 
because it would create a single, hopefully well validated way to access 
information and edit the fundamental data.frame object, which would be helpful 
if there are several programmers on the team and/or if some of the data.frame 
manipulations are not straightforward and are best left encapsulated in a 
method of a class, and then have people use that method.  I would just like to 
know if there are reasons not do it that way and if there are any examples of 
packages that use that approach and that I can learn from.

Thanks in advance,

Ramiro

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] apply --> data.frame

2012-08-31 Thread Rui Barradas

Hello,

Em 31-08-2012 11:35, Martin Maechler escreveu:

Hi Rui,

I think when people are asking how to do such basic things in R,
the answer should *not* be to use a (non-base / -recommended)
package...


You're right, that's why I've called it an alternative.



but then, you may have really wanted to advertize plyr,
so never mind.


It' usefull to know that those extra *ply functions are available.

Anyway, my vote goes to Bill Dunlap's read.table solution. I would never 
have thought of that one.


Rui Barradas


Martin


"RB" == Rui Barradas 
 on Thu, 30 Aug 2012 18:57:34 +0100 writes:

 RB> Hello,

 RB> Yet another alternative.


 RB> library(plyr)
 RB> dfr <- ldply(strsplit(c("a,1", "b,2", "c,3"), ","), identity)
 RB> str(dfr)

 RB> #dfr$V2 <- as.numeric(dfr$V2)

 RB> So, if the op was about conversion to df, the answer is yes.

 RB> Rui Barradas

 RB> Em 30-08-2012 18:14, David Winsemius escreveu:
 >>
 >> On Aug 30, 2012, at 9:44 AM, Sam Steingold wrote:
 >>
  * Sam Steingold  [2012-08-30 08:56:17 -0400]:
 
  Is there a way for an apply-type function to return a data frame?
  the closest thing I think of is
 
  foo <- as.data.frame(t(sapply(...)))
  names(foo) <- c()
 >>>
 >>> alas, this has a problem of creating a "homogeneous" data frame, i.e.,
 >>> all the columns are numbers or characters, because the function passed
 >>> to sapply returns c() and
  c(1,2,"a")
 >>> [1] "1" "2" "a"
 >>>
 >>> e.g.,
 >>> as.data.frame(t(sapply(c("a,1","b,2","c,3"),function (n)
 >>> strsplit(n,",")[[1]])))
 >>> V1 V2
 >>> a,1  a  1
 >>> b,2  b  2
 >>> c,3  c  3
 >>>
 >>> 'data.frame':3 obs. of  2 variables:
 >>> $ V1: Factor w/ 3 levels "a","b","c": 1 2 3
 >>> ..- attr(*, "names")= chr  "a,1" "b,2" "c,3"
 >>> $ V2: Factor w/ 3 levels "1","2","3": 1 2 3
 >>> ..- attr(*, "names")= chr  "a,1" "b,2" "c,3"
 >>>
 >>> I wanted the V1 column to be a string, and V2 to be a number.
 >>> (I know stringsAsFactors=FALSE would replace factors with strings, but 
I
 >>> need a string and a number)
 >>>
 >>> I could, of course, do ret$V2 <- as.numeric(ret$V2) but this would mean
 >>> a double conversion: from number to string first (by c()) and then 
back.
 >>
 >> It is starting as a 'string' ('character' in R parlance) so you will
 >> need to coerce it to "numeric" at some point:
 >>
 >> Consider this alternate route:
 >>
 >> > do.call(rbind, strsplit(c("a,1","b,2","c,3"), ",") )
 >> [,1] [,2]
 >> [1,] "a"  "1"
 >> [2,] "b"  "2"
 >> [3,] "c"  "3"
 >> > as.data.frame( do.call(rbind, strsplit(c("a,1","b,2","c,3"), ",") ) )
 >> V1 V2
 >> 1  a  1
 >> 2  b  2
 >> 3  c  3
 >>
 >> > str( as.data.frame( do.call(rbind, strsplit(c("a,1","b,2","c,3"),
 >> ",") ) , stringsAsFactors=FALSE) )
 >> 'data.frame':3 obs. of  2 variables:
 >> $ V1: chr  "a" "b" "c"
 >> $ V2: chr  "1" "2" "3"
 >>

 RB> __
 RB> R-help@r-project.org mailing list
 RB> https://stat.ethz.ch/mailman/listinfo/r-help
 RB> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
 RB> and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] fitting lognormal censored data

2012-08-31 Thread Berend Hasselman

On 31-08-2012, at 03:54, Salma Wafi wrote:

> Hi ,
> I am trying to get some estimator based on lognormal distribution when we 
> have left,interval, and right censored data. Since, there is now avalible 
> pakage in R can help me in this, I had to write my own code using Newton 
> Raphson method which requires first and second derivative of log likelihood 
> but my problem after runing the code is the estimators were too high. with 
> this email ,I provide simple example for estimation parameters for lognormal 
> when we have only right censored data, and I tried to estimate the parameters 
> using three methods, Surv pakage, optim function, and Newton Raphson 
> calculation. For the Surv pakage and Optim function, I got similar results of 
> estimation values to the true values, but for the Newton Raphson, I got 
> problem with estimation, where it was  too high" overestimation". Is there 
> any one cen help me to know my problem when can be or recommend me to read 
> some good references that can be useful?
> Best regards.
>  Below is the example,  
>  
> cur=curd=cen=cens=array(1,100) 
> Cur=RealCur=realcensoring=realcured=array(1,20)
> ExpCure=Bias=RealCure=array(1,21)
> ##
> Z1=c(rbinom(100,1,0.5))
> Z2=c(rbinom(100,1,0.5))
> dat1<-data.frame(time=rlnorm( 
> 100,2,0.8),Censored=rbinom(100,1,0.9),Cured=rbinom(100,1,.3))
> 
> dat2<-dat1[order(dat1[,1]),]  # order the data #
> for (i in 1:10)
> {
> dat2$Cured[i+90]=0 #Long term survivors/10 individuals#
> dat2$Censored[i+90]=0 
> dat2$time[i+90]=dat2$time[90]
> }
> cens<-c(dat2$Censored) #censored status #
> curd<-c(dat2$Cured)#cured status #
> tim<-c(dat2$time)  #lifetimes #
> X1<-c(Z1)  #X1 Covariate=Age #
> X2<-c(Z2)  #X2 Covariate=Type of treatment(1=chemo,0=radio) #
> 
> L1<-length(cens)   #number of censored#
> for (j in 1:L1)
> {
> if ((cens[j]==1)&(curd[j]==0)) {(cen[j]=1)&(cur[j]=1)}
> 
> else {(cen[j]=cens[j])&(cur[j]=curd[j])}
> } 
> Now, the following is my data:
> 
> ###My Data only with uncensored and right censored  
> 
> data=data.frame(Ti=dat1$time,Censored=cen)
> 
>  Estimation using Surv pakage 
> library(survival)
> survreg(Surv(Ti, Censored)~1, data=data, dist="lognormal")
> ### Seperating the data for simply using optim function and  Newton 
> Raphson 
> 
> dat2<-c(split(data[1:2],data$Censored==1))  # Split the data(cen/uncen) # 
> tun<-c((dat2$'TRUE')$Ti)  # Life times data set for uncensored #
> Stat.uncen<-c((dat2$'TRUE')$Censored) # uncensored Status data set #
> tcen<-c((dat2$'FALSE')$Ti)   # Life times data set for censored #
> Stat.cen<-c((dat2$'FALSE')$Censored)  # censoring Status data set #
> # Estimation using optim function  
> ml= function(par){mu=par[1]
> s=par[2]
> -sum(dlnorm(tun,mu,s,log=TRUE))-sum(1-plnorm(tcen,mu,s))}
> 
> Max=optim(c(0.5,0.5),ml)
> param=c(Max$par)
> 
> ## My Problem is when I try to estimate parameters using newton raphson  ###
> 
> ### intial values ##
> mu=1
> s=1
> # Derivative for mu ##
> F1= function(par){ mu=par[1]
> s=par[2]
> sum((log(tun)-mu)/s^2)+ sum(((1/sqrt(2*pi))*
> exp(-((log(tcen)-mu)^2)/2*s^2))/(s^2*(1-plnorm(tcen,mu,s  }
> ## Derivative for sigma" s "  ###
> 
> F2= function(par){
> mu=par[1]
> s=par[2]
> sum((log(tun)-mu)^2/s^3 -1/s)+ sum((log(tcen)-mu)*((1/sqrt(2*pi))*
> exp(-((log(tcen)-mu)^2)/2*s^2))/(s^2*(1-plnorm(tcen,mu,s}
> ### Total Function  
> F=function(par){
> F=matrix(0,nrow=2)
> F[1]=F1(par)
> F[2]=F2(par)
> F   }
>   the Jacobian matrix, a 2 x 2 matrix  ###
> d11=d12=d21=d22=array()
> J=function(par){
> j=matrix(0,ncol=2,nrow=2)
> # The format of J is 2 by 2#
> d11=0; d12=0;
> d21=0;d22=0
>  second Derivative for mu ##
> d11 = function(par){
> mu=par[1]
> s=par[2]
> sum(-1/s^2)-sum((1/s^2)*
> (((1/sqrt(2*pi))*exp(-((log(tcen)-mu)^2)/2*s^2))/s*((1-plnorm(tcen,mu,s*
> (-(log(tcen)-mu)/s +((1/sqrt(2*pi))*exp(-((log(tcen)-mu)^2)/2*s^2))/
> (s*(1-plnorm(tcen,mu,s)}
> 
>  Derivative for mu/s   
> ##
> d12= function(par){
> mu=par[1]
> s=par[2]
> -sum(2*(log(tun)-mu)/s^3)-sum((1/s^2)*(((1/sqrt(2*pi))*
> exp(-((log(tcen)-mu)^2)/2*s^2))/(s*(1-plnorm(tcen,mu,s)))+((log(tcen)-mu)/s)*
> ((1/sqrt(2*pi))*exp(-((log(tcen)-mu)^2)/2*s^2))/(s*(1-plnorm(tcen,mu,s)))*
> (((1/sqrt(2*pi))*
> exp(-((log(tcen)-mu)^2)/2*s^2))/(s*(1-plnorm(tcen,mu,s)))-(log(tcen)-mu)/s)))}
> d21=d12
>  second Derivative for s###
> d22=function(par){
> mu=par[1]
> s=par[2]
> sum((-3*(log(tun)-mu)^2/s^4)+1/s^2)-sum((1/s^2)*((log(tcen)-mu)/s)*(((1/sqrt(2*pi))*
> exp(-((log(tcen)-mu)^2)/2*s^2

Re: [R] apply --> data.frame

2012-08-31 Thread PIKAL Petr
Hi

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of Sam Steingold
> Sent: Thursday, August 30, 2012 9:17 PM
> To: William Dunlap
> Cc: r-help@r-project.org
> Subject: Re: [R] apply --> data.frame
> 
> > * William Dunlap  [2012-08-30 17:35:08 +]:
> >
> > I don't agree with your analysis of what went wrong with your example
> >> a double conversion: from number to string first (by c()) and then
> back.
> I did not make myself quite clear, sorry.
> I should have written something like
> c(1,2,"a") ==> "1" "2" "a" =[as.numeric]=> 1 2 "a"

But vector cannot contain values of different mode (character, numeric). There 
are other structures capable of this. Converting such vector by as numeric 
inserts NA whenever it encounters non numeric characters.

> as.numeric(c(1,2,"a") )
[1]  1  2 NA
Warning message:
NAs introduced by coercion 

Regards
Petr

> 
> --
> Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X
> 11.0.11103000 http://www.childpsy.net/ http://openvotingconsortium.org
> http://www.memritv.org http://ffii.org http://truepeace.org
> http://palestinefacts.org Those who can't write, write manuals.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.