Re: [R] histogram with bars colored according to a vector of values
On 07/26/2013 12:13 PM, john d wrote: Dear all, Let's say I have the following data.frame: dat-data.frame(x=rnorm(100), y=rnorm(100,2)) and I plot a histogram of variable x, somethink like: hist(dat$x, breaks=-5:5) Now, I'd like to color each bar according to the mean of the cases according to y. For instance, the color of the bar between -2 and -1 should reflect the mean of variable y for the corresponding cases. Any suggestions? John Hi John, Try this: dat$xcut-cut(dat$x,breaks=-5:5) library(plotrix) barcol-color.scale(by(dat$y,dat$xcut,mean),extremes=c(red,blue)) hist(dat$x,breaks=-5:5,col=barcol) Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help with apply (lapply or sapply not sure)
Hi I would just use for cycle. Something like lll - vector(list, length(InputFiles)) for (i in InputFiles) { everything what you want to do for each file you can store it in a list lll[[i]] - whatever you want to store there } After populating list you can do anything with it. Regards Petr -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Robert Lynch Sent: Friday, July 26, 2013 2:46 AM To: r-help@r-project.org Subject: [R] help with apply (lapply or sapply not sure) I am reading in a bunch of files and then processing them all in the same way. I am sure there as a better way then to copy and past the code for each file. Here is what I've done so far InputFiles- as.character(list.files(~/ISLE/RWork/DataWarehouseMining/byCourse/)) #Path to the Course data files for (i in InputFiles) { # print(head(read.csv(paste(~/ISLE/RWork/DataWarehouseMining/byCourse/, i, sep= print(paste(Reading file ~/ISLE/RWork/DataWarehouseMining/byCourse/, i, sep=,)) assign(i, read.csv(paste(~/ISLE/RWork/DataWarehouseMining/byCourse/, i, sep=))) }#note last file is NOT a course file by the student information. Master-StudentInfoForRobertWUnitAt7A_2.csv #this is the last file CourseFiles -InputFiles[- c(15,16)] # ignore the student info ...7A.csv ...7A_2.csv #for each file I do the following #Bis 101 summary(BigInstBIS101.csv) B101 - BigInstBIS101.csv[-c(3,4,8)] summary(B101) B101$WH_ID - as.factor(B101$WH_ID) B101$SID - as.factor(B101$SID) B101$TERM - as.factor(B101$TERM) B101$CRN - as.factor(B101$CRN) B101$CRN_TRM - as.factor(B101$CRN_TRM) B101$INST_NUM - as.factor(B101$INST_NUM) B101$zGrade - with(B101, ave(GRADE., list(TERM, INST_NUM), FUN = scale)) write.csv(B101,B101.csv, row.names = FALSE) #Bis 2A B2A - BigInstBIS2A.csv[-c(3,4,8)] summary(B2A) B2A$WH_ID - as.factor(B2A$WH_ID) B2A$SID - as.factor(B2A$SID) B2A$TERM - as.factor(B2A$TERM) B2A$CRN - as.factor(B2A$CRN) B2A$CRN_TRM - as.factor(B2A$CRN_TRM) B2A$INST_NUM - as.factor(B2A$INST_NUM) B2A$zGrade - with(B2A, ave(GRADE., list(TERM, INST_NUM), FUN = scale)) write.csv(B2A,B2A.csv, row.names = FALSE) And so on for another 12 courses, however I am changing what I am doing as part of the reading in the file and don't want to replace the code in 14 different places. suggestions? Thanks! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Repeated measures Cox regression ??coxph??
On 07/26/2013 04:06 AM, John Sorkin wrote: David Thank you for your thoughts. The data I am analyzing do not come from a clinical trial but rather from a cohort study whose aim is to determine risk factors for surgical therapy to treat their joints. John As David explained, there are several ways of approaching this situation. If it is a treatment/control case (which yours isn't), stratification is appropriate. It boils down to a simple sign test where we compare the number of pairs with longest survival of the treated with the number of pairs with the longest survival of the control. Undetermined (due to censoring) pairs are thrown away. Generally, stratification is an alternative to the frailty model, but it has some drawbacks: loss of power (especially with small stratum sizes), and you cannot use covariates that are constant within pairs (personal characteristics in your case). The frailty model comes with stronger assumptions than stratification, but you avoid the drawbacks just mentioned. The clustering method, finally, is for variance correction in the ordinary Cox regression. In your case, would recommend the frailty approach with coxme (while we wait for Terry's verdict!). Göran Sent from my iPhone On Jul 25, 2013, at 9:15 PM, Marc Schwartz marc_schwa...@me.com marc_schwa...@me.com wrote: On Jul 25, 2013, at 4:45 PM, David Winsemius dwinsem...@comcast.net wrote: On Jul 25, 2013, at 12:27 PM, Marc Schwartz wrote: On Jul 25, 2013, at 2:11 PM, John Sorkin jsor...@grecc.umaryland.edu wrote: Colleagues, Is there any R package that will allow one to perform a repeated measures Cox Proportional Hazards regression? I don't think coxph is set up to handle this type of problem, but I would be happy to know that I am not correct. I am doing a study of time to hip joint replacement. As each person has two hips, a given person can appear in the dataset twice, once for the left hip and once for the right hip, and I need to account for the correlation of data from a single individual. Thank you, John John, See Terry's 'coxme' package: http://cran.r-project.org/web/packages/coxme/index.html When I looked over the description of coxme, I was concerned it was not really designed with this in mind. Looking at Therneau and Grambsch, I thought section 8.4.2 in the 'Multiple Events per Subject' Chapter fit the analysis question well. There they compared the use of coxph( ...+cluster(ID),,...) withcoxph( ...+strata(ID),,...). Unfortunately I could not tell for sure which one was being described as superio but I think it was the cluster() alternative. I seem to remember there are discussions in the archives. David, I think that you raise a good point. The example in the book (I had to wait to get home to read it) is potentially different however, in that the subject's eye's were randomized to treatment or control, which would seem to suggest comparable baseline characteristics for each pair of eyes, as well as an active intervention on one side where a difference in treatment effect between each eye is being analyzed. It is not clear from John's description above if there is one hip that will be treated versus one as a control and whether the extent of disease at baseline is similar in each pair of hips. Presumably the timing of hip replacements will be staggered at some level, even if there is comparable disease, simply due to post-op recovery time and surgical risk. In cases where the disease between each hip is materially different, that would be another factor to consider, however I would defer to orthopaedic physicians/surgeons from a subject matter expertise consideration. It is possible that the bilateral hip replacement data might be more of a parallel to bilateral breast cancer data, if each breast were to be tracked separately. I have cc'd Terry here, hoping that he might jump in and offer some insights into the pros/cons of using coxme versus coxph with either a cluster or strata based approach, or perhaps even a frailty based approach as in 9.4.1 in the book. Regards, Marc -- David. You also might find the following of interest: http://bjo.bmj.com/content/71/9/645.full.pdf http://www.ncbi.nlm.nih.gov/pubmed/6885 http://www.ncbi.nlm.nih.gov/pubmed/22078901 Regards, Marc Schwartz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Confidentiality Statement: This email message, including any attachments, is for
Re: [R] Change values in a dateframe-Speed TEST
Hi Michel, Sorry, I misunderstood your question. You could also try: library(plyr) df1New-droplevels(ddply(TEST,.(Matricule),function(x) {x[,c(Nom,Prenom)]- x[1,c(Nom,Prenom)];x})) df2New-droplevels(ddply(TEST,.(Matricule),function(x) {x[,c(Nom,Prenom)]- x[nrow(x),c(Nom,Prenom)];x})) identical(df1,df1New) #[1] TRUE identical(df2,df2New) #[1] TRUE #or library(data.table) dt1- data.table(TEST) dt2-dt1 dt1[,Nom:=head(Nom,1),by=Matricule] dt1[,Prenom:=head(Prenom,1),by=Matricule] identical(df1,droplevels(as.data.frame(dt1))) #[1] TRUE dt2[,Nom:=tail(Nom,1),by=Matricule] dt2[,Prenom:=tail(Prenom,1),by=Matricule] identical(df2,droplevels(as.data.frame(dt2))) #[1] TRUE #If you are considering speed, then ?data.table() would be useful set.seed(28) dfTest- as.data.frame(matrix(sample(1:50,1e6*5,replace=TRUE),ncol=5)) system.time({res1-do.call(rbind,lapply(split(dfTest,dfTest$V1), FUN=function(x) {x[,c(V2,V3)] - x[1,c(V2,V3)];x}))}) # user system elapsed # 4.452 0.036 4.499 row.names(res1)-1:nrow(res1) dtNew- data.table(dfTest) system.time({dtNew[,V2:=head(V2,1),by=V1] dtNew[,V3:=head(V3,1),by=V1] dtNew-dtNew[order(V1)] #here, the dataset was not pre-sorted, so just to keep the same order as the above solution }) # user system elapsed # 0.132 0.000 0.133 identical(res1,as.data.frame(dtNew)) #[1] TRUE A.K. - Original Message - From: Arnaud Michel michel.arn...@cirad.fr To: Berend Hasselman b...@xs4all.nl Cc: R help r-help@r-project.org Sent: Thursday, July 25, 2013 3:59 AM Subject: Re: [R] Change values in a dateframe-Speed TEST Le 25/07/2013 08:50, Berend Hasselman a écrit : On 25-07-2013, at 08:35, Arnaud Michel michel.arn...@cirad.fr wrote: But I just noticed that the two solutions are not comparable : the change concern only Nom and Prenom (solution Berend) and not also Sexe or Date.de.naissance orother variables (solution Arun) that can changed. But my question was badly put. Indeed:-) But that can be remedied with (small correction w.r.t. initial solution: drop=TRUE removed; not relevant here) r1 - droplevels(do.call(rbind,lapply(split(TEST,TEST$Matricule), FUN=function(x) {x[,1:ncol(x)] - x[1,1:ncol(x)];x}))) and r2 - droplevels(do.call(rbind,lapply(split(TEST,TEST$Matricule), FUN=function(x) {x[,1:ncol(x)] - x[nrow(x),1:ncol(x)];x}))) Thank you but I keep {x[,c(Nom,Prénom)] - x[nrow(x),c(Nom,Prénom)];x} because in the dataframe there are other variables that I do not want to change. I want change only Nom and Prénom PS : ?w.r.t. Michel Less elegant than alternative with ave Berend Michel Le 25/07/2013 08:06, Arnaud Michel a écrit : Hi For a dataframe with name PaysContrat1 and with nrow(PaysContrat1) [1] 52366 the test of system.time is : system.time(droplevels(do.call(rbind,lapply(split(PaysContrat1,PaysContrat1$Matricule), FUN=function(x) {x[,c(Nom,Prénom)] - x[nrow(x),c(Nom,Prénom),drop=TRUE];x} user system elapsed 14.03 0.00 14.04 system.time(droplevels(PaysContrat1[with(PaysContrat1,ave(seq_along(Matricule),Matricule,FUN=min)) ,] )) user system elapsed 0.2 0.0 0.2 Michel Le 24/07/2013 15:29, arun a écrit : Hi Michel, You could try: df1New-droplevels(TEST[with(TEST,ave(seq_along(Matricule),Matricule,FUN=min)),]) row.names(df1New)-1:nrow(df1New) df2New-droplevels(TEST[with(TEST,ave(seq_along(Matricule),Matricule,FUN=max)),]) row.names(df2New)-1:nrow(df2New) identical(df1New,df1) #[1] TRUE identical(df2New,df2) #[1] TRUE A.K. - Original Message - From: Arnaud Michel michel.arn...@cirad.fr To: R help r-help@r-project.org Cc: Sent: Wednesday, July 24, 2013 2:39 AM Subject: [R] Change values in a dateframe Hello I have the following problem : The dataframe TEST has multiple lines for a same person because : there are differents values of Nom or differents values of Prenom but the values of Matricule or Sexe or Date.de.naissance are the same. TEST - structure(list(Matricule = c(66L, 67L, 67L, 68L, 89L, 90L, 90L, 91L, 108L, 108L, 108L), Nom = structure(c(1L, 2L, 2L, 4L, 8L, 5L, 6L, 9L, 3L, 3L, 7L), .Label = c(CHICHE, GEOF, GUTIER, JACQUE, LANGUE, LANGUE-LOPEZ, RIVIER, TRU, VINCENT ), class = factor), Prenom = structure(c(8L, 3L, 4L, 5L, 1L, 2L, 2L, 9L, 6L, 7L, 7L), .Label = c(Edgar, Elodie, Jeanine, Jeannine, Michel, Michele, Michèle, Michelle, Victor ), class = factor), Sexe = structure(c(1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 1L), .Label = c(Féminin, Masculin), class = factor), Date.de.naissance = structure(c(4L, 2L, 2L, 7L, 6L, 5L, 5L, 1L, 3L, 3L, 3L), .Label = c(03/09/1940, 04/03/1946, 07/12/1947, 18/11/1945, 27/09/1947, 29/12/1936, 30/03/1935), class = factor)), .Names = c(Matricule, Nom, Prenom, Sexe, Date.de.naissance), class = data.frame, row.names = c(NA, -11L)) I would want to make homogeneous the information
Re: [R] Function, that assigns two vectors to each other
HI, You could try: my.data- structure... pe-round(pe,0) peMax-as.data.frame(sapply(paste0(a,1:4),function(x) {x1-my.data[,x]; unsplit(lapply(split(x1,x1),function(y) {x2-row.names(pe)[pe[,x]%in% y]; x3-x2[which.max(as.numeric(gsub(\\D+,,x2)))];rep(x3,length(y))}),x1)}),stringsAsFactors=FALSE) names(peMax)- paste0(pe,1:4) my.dataNew- cbind(my.data,peMax) my.dataNew # pa a1 a2 a3 a4 pe1 pe2 pe3 pe4 #1 1 84 113 96 76 12% 89% 24% 0% #2 2 108 101 108 106 90% 45% 80% 72% #3 3 113 99 110 94 100% 36% 89% 25% #4 4 99 108 99 124 78% 61% 49% 100% #5 5 98 122 118 91 72% 100% 100% 12% #6 6 88 92 100 103 27% 12% 58% 46% #7 7 90 90 90 107 45% 2% 12% 78% #8 8 89 110 89 106 38% 79% 5% 72% #9 9 95 109 102 113 57% 72% 67% 89% #10 10 77 95 99 96 0% 23% 49% 34% If you don't want the % attached to the number my.dataNew[,6:9]-lapply(my.dataNew[6:9],function(x) as.numeric(gsub(\\D+,,x))) my.dataNew # pa a1 a2 a3 a4 pe1 pe2 pe3 pe4 #1 1 84 113 96 76 12 89 24 0 #2 2 108 101 108 106 90 45 80 72 #3 3 113 99 110 94 100 36 89 25 #4 4 99 108 99 124 78 61 49 100 #5 5 98 122 118 91 72 100 100 12 #6 6 88 92 100 103 27 12 58 46 #7 7 90 90 90 107 45 2 12 78 #8 8 89 110 89 106 38 79 5 72 #9 9 95 109 102 113 57 72 67 89 #10 10 77 95 99 96 0 23 49 34 A.K. - Original Message - From: Anne-Marie B. Gallrein gallr...@psychologie.tu-dresden.de To: John Kane jrkrid...@inbox.com Cc: r-help@r-project.org Sent: Thursday, July 25, 2013 7:28 AM Subject: Re: [R] Function, that assigns two vectors to each other Hello guys, I created an example data set: structure(list(pa = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), a1 = c(84, 108, 113, 99, 98, 88, 90, 89, 95, 77), a2 = c(113, 101, 99, 108, 122, 92, 90, 110, 109, 95), a3 = c(96, 108, 110, 99, 118, 100, 90, 89, 102, 99), a4 = c(76, 106, 94, 124, 91, 103, 107, 106, 113, 96)), .Names = c(pa, a1, a2, a3, a4), row.names = c(NA, -10L), class = data.frame) So the data frame contains the numbers of my participants (1 to 10) and the score, they hit on 4 tasks (a1 to a4). I wrote this function, to use on the data: pe-apply(X=my.data[,c(a1,a2,a3,a4)], MARGIN=2, FUN=quantile, probs=seq(0,1,by=.01), na.rm=TRUE) round(pe,0) It computes the percentiles of each task. So when using this function I know, that e.g. a person who got 77 points on task 1 (a1) has a percentile of 0%. If a person scores 88 points then he/she got the percentiles 21% to 27%, so 27% got the same amount of points or less. In comparison in task 4 (a4) a person reaching 77 points has a percentile of 1%. Now I want to add 4 columns to my.data (pe1 to pe4). The final data frame my.data shall have 10 rows and 9 columns These columns (pe1 to pe4) shall show the maximum percentile someone reached according to his points for each task. So for the person who reached 77 points in a1 the respective pe1 would be 0. For all the people who reached 88 points in a1 the respective pe1 would be 27. For all the people who reached 77 points in a4 the respective pe1 would be 1. The final data frame my.data shall have 10 rows and 9 columns. So for the first participant (pa=1), the pe's would be a1=84 -- pe=12; a2=113 -- pe=89, a3=96 -- pe=24, a4=76 -- pe=0 I hope, that is clearer than before :) Thanks a lot, Anne Am 24.07.2013 14:47, schrieb John Kane: Welcome to R-help it is a bit hard to see exactly what you want without data. Rest of the explanation looks good though it appears you may have sent this in HTML and the list asks for text. It strips out the html and we lose any html format. Can I suggest reading these https://github.com/hadley/devtools/wiki/Reproducibility http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example and then getting back to use with some data. The best way to provide data , as is described in the above links is to use dput() (type ?dput for help ) and then just copy and paste the results into the mail. John Kane Kingston ON Canada -Original Message- From: gallr...@psychologie.tu-dresden.de Sent: Wed, 24 Jul 2013 12:25:35 +0200 To: r-help@r-project.org Subject: [R] Function, that assigns two vectors to each other Hey guys, In my data setv (KD) I have 4 columns (Punkte.AG1,Punkte.AG2,Punkte.AG3,Punkte.WI) I'm interested in. These columns contain the participants' scores of a specific task. I computed the percentiles of the columns using this code: pe-apply(X=KD[,c(Punkte.AG1,Punkte.AG2,Punkte.AG3,Punkte.WI)], MARGIN=2, FUN=quantile, probs=seq(0,1,by=.01), na.rm=TRUE) round(pe,0) This is the output (to the 20^th percentile): pe Punkte.AG1 Punkte.AG2 Punkte.AG3 Punkte.WI 0%6319 1%74311 2%86312 3%87412 4%97512 5%98512 6%108512 7%108512
Re: [R] How to split two levels several times?
May be this also helps: XXX: dataset rl-rle(as.character(XXX$electrode)) dat-do.call(rbind,lapply(seq_along(rl$lengths),function(i){x1-if(rl$values[i]==electrode1 (rl$lengths[i]%/%31)) rep(3,rl$lengths[i]%/%3) else rl$lengths[i];data.frame(Len=x1,Val=rl$values[i])})) lst1-split(cumsum(dat[,1]),((seq_along(dat[,1])-1)%/%2)+1) vec1-sapply(lst1,max) vec2-c(1,vec1[-length(vec1)]+1) res- lapply(seq_along(lst1),function(i) {x1-lst1[[i]]; XXX[seq(vec2[i],max(x1)),]}) res #[[1]] # electrode length #1 electrode1 5.7 #2 electrode1 6.3 #3 electrode1 6.2 #4 electrode2 11.4 #5 electrode2 9.7 # #[[2]] # electrode length #6 electrode3 14.2 #7 electrode3 14.8 #8 electrode3 12.6 #9 electrode2 11.4 #10 electrode2 9.7 # #[[3]] # electrode length #11 electrode4 17.0 #12 electrode4 16.3 #13 electrode4 17.8 #14 electrode4 18.3 #15 electrode4 16.9 #16 electrode4 18.5 #17 electrode1 5.7 #18 electrode1 6.3 #19 electrode1 6.2 #[[4]] # electrode length #20 electrode1 5.7 #21 electrode1 6.3 #22 electrode1 6.2 #23 electrode3 14.2 #24 electrode3 14.8 #25 electrode3 12.6 Also, tested in cases like below: XXX1- structure(list(electrode = structure(c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 2L, 2L, 4L, 4L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L), .Label = c(electrode1, electrode2, electrode3, electrode4), class = factor), length = c(5.7, 6.3, 6.2, 11.4, 9.7, 14.2, 14.8, 12.6, 11.4, 9.7, 17, 16.3, 17.8, 18.3, 16.9, 18.5, 5.7, 6.3, 6.2, 7.7, 7.3, 6.2, 6.7, 6.8, 6.9, 5.7, 6.3, 6.2, 14.2, 14.8, 12.6)), .Names = c(electrode, length), class = data.frame, row.names = c(NA, -31L)) XXX2-structure(list(electrode = structure(c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 2L, 2L, 4L, 4L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L), .Label = c(electrode1, electrode2, electrode3, electrode4), class = factor), length = c(5.7, 6.3, 6.2, 11.4, 9.7, 14.2, 14.8, 12.6, 11.4, 9.7, 17, 16.3, 17.8, 18.3, 16.9, 18.5, 5.7, 6.3, 6.2, 7.7, 7.3, 6.2, 6.7, 6.8, 6.9, 14.2, 14.8, 12.6)), .Names = c(electrode, length), class = data.frame, row.names = c(NA, -28L)) rl-rle(as.character(XXX1$electrode)) dat-do.call(rbind,lapply(seq_along(rl$lengths),function(i){x1-if(rl$values[i]==electrode1 (rl$lengths[i]%/%31)) rep(3,rl$lengths[i]%/%3) else rl$lengths[i];data.frame(Len=x1,Val=rl$values[i])})) lst1-split(cumsum(dat[,1]),((seq_along(dat[,1])-1)%/%2)+1) vec1-sapply(lst1,max) vec2-c(1,vec1[-length(vec1)]+1) res1- lapply(seq_along(lst1),function(i) {x1-lst1[[i]]; XXX1[seq(vec2[i],max(x1)),]}) rl-rle(as.character(XXX2$electrode)) dat-do.call(rbind,lapply(seq_along(rl$lengths),function(i){x1-if(rl$values[i]==electrode1 (rl$lengths[i]%/%31)) rep(3,rl$lengths[i]%/%3) else rl$lengths[i];data.frame(Len=x1,Val=rl$values[i])})) lst1-split(cumsum(dat[,1]),((seq_along(dat[,1])-1)%/%2)+1) vec1-sapply(lst1,max) vec2-c(1,vec1[-length(vec1)]+1) res2- lapply(seq_along(lst1),function(i) {x1-lst1[[i]]; XXX2[seq(vec2[i],max(x1)),]}) Didn't test it extensively. So, it may fail in other situations. A.K. - Original Message - From: Rui Barradas ruipbarra...@sapo.pt To: dennis1...@gmx.net Cc: r-help@r-project.org Sent: Thursday, July 25, 2013 2:53 PM Subject: Re: [R] How to split two levels several times? Hello, I think the following does what you want. (I don't know if it makes much sense but it works.) lens - rle(as.character(XXX$electrode))$lengths m - length(lens) %/% 2 idx - rep(1:m, sapply(1:m, function(.m) sum(lens[(2*.m - 1):(2*.m)]))) if(length(lens) %% 2 != 0){ idx - c(idx, rep(m + 1, lens[length(lens)])) sp_idx - split(idx, idx) n - length(sp_idx[[m]]) if(n %/% 2 length(sp_idx[[m + 1]])) sp_idx[[m]][(n %/% 2 + 1):n] - sp_idx[[m + 1]][1] else sp_idx[[m]][(n - length(sp_idx[[m + 1]]) + 1):n] - sp_idx[[m + 1]][1] idx - unlist(sp_idx) } sp - split(XXX, idx) sp Rui Barradas Em 25-07-2013 11:40, dennis1...@gmx.net escreveu: Hi Rui once more thank you for your help. But the code does so far not solve the problem because it still treats rows 17-22 (repeated appearance of electrode1) as one single level. However as can be seen by rows 1-3 (or rows 17-19 and rows 20-22) and the order of the length variable (row 1 = 5.7, row 2 = 6.3, row 3 = 6.2) electrode1 consists only of 3 rows. Maybe that was not made absolutely clear by me. As described in my mail before if by chance (or systematically) it happens to be that electrode1 appears right after each other in the table then the code should split it “half way”. So idx should not return [1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 but instead 6 times number 4 at the end [1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 Do you have any solution? Gesendet: Mittwoch, 24. Juli 2013 um 23:47 Uhr Von: Rui Barradas ruipbarra...@sapo.pt An: dennis1...@gmx.net Cc:
[R] How do I get a a p-value for the output of an lme model with lme4?
Hi there, I just started using lme4 and I have a question about obtaining p-values. I'm trying to get p-values for the output of a linear mixed-effects model. In my experiment I have a 2 by 2 within subjects design, fully crossing two factors, Gram and Number. This is the command I used to run the model: m - lmer(RT ~ Gram*Number + (1|Subject) + (0+Gram+Number|Subject) + (1|Item_number),data= data) If I understand this code, I am getting coefficients for the two fixed effects (Gram and Number) and their interaction, and I am fitting a model that has by-subject intercepts and slopes for the two fixed effects, and a by-item intercept for them. Following Barr et al. (2013), I thought that this code gets rid of the correlation parameters. I don't want estimate the correlations because I want to get the p-values using pvals.fnc (), and I read that this function doesn't work if there are correlations in the model. The command seems to work: m Linear mixed model fit by REML Formula: RT ~ Gram * Number + (1 | Subject) + (0 + Gram + Number | Subject) + (1 |Item_number) Data: mverb[mverb$Region == 06v1, ] AIC BIC logLik deviance REMLdev 20134 20204 -1005420138 20108 Random effects: Groups NameVariance Std.Dev. Corr Item_number (Intercept) 273.508 16.5381 Subject Gramgram0.000 0. Gramungram 3717.213 60.9689NaN Number159.361 7.7046NaN -1.000 Subject (Intercept) 14075.240 118.6391 Residual35758.311 189.0987 Number of obs: 1502, groups: Item_number, 48; Subject, 32 Fixed effects: Estimate Std. Error t value (Intercept)402.520 22.321 18.033 Gram1 -57.788 14.545 -3.973 Number1 -4.191 9.858 -0.425 Gram1:Number1 15.693 19.527 0.804 Correlation of Fixed Effects: (Intr) Gram1 Numbr1 Gram1 -0.181 Number1 -0.034 0.104 Gram1:Nmbr1 0.000 -0.002 -0.011 However, when I try to calculate the p-values I still get an error message: pvals.fnc(m, withMCMC=T)$fixed Error in pvals.fnc(m, withMCMC = T) : MCMC sampling is not implemented in recent versions of lme4 for models with random correlation parameters Am I making a mistake when I specify my model? Shouldn't pvals.fnc() work if I removed the correlations? Thanks for your help! --Sol __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Duplicated function with conditional statement
Hi everybody,, I have a question about R function duplicated(). I have spent days try to figure this out,but I cant find any solution yet. I hope somebody can help me.. this is my data: subj=c(1,1,1,2,2,3,3,3,4,4) response=c('sample','sample','buy','sample','buy','sample',' sample','buy','sample','buy') product=c(1,2,3,2,2,3,2,1,1,4) tt=data.frame(subj, response, product) the data look like this: subj response product 1 1 sample 1 2 1 sample 2 3 1 buy 3 4 2 sample 2 5 2 buy 2 6 3 sample 3 7 3 sample 2 8 3 buy 1 9 4 sample 1 10 4 buy4 I want to create new column based on the value on response and product column. if the value on product is duplicated, then the value on new column is 1, otherwise is 0. but I want to add conditional statement that the value on product column will only be considered as duplicated if the value on response column is 'buy'. for illustration, the table should look like this: subj response product newcolumn 1 1 sample 1 0 2 1 sample 2 0 3 1 buy 3 0 4 2 sample 2 0 5 2 buy 2 0 6 3 sample 3 1 7 3 sample 2 1 8 3 buy 1 0 9 4 sample 11 10 4 buy 4 0 can somebody help me? any help will be appreciated. I am new in this mailing list, so forgive me in advance, If I did not ask the question appropriately. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] duplicated() with conditional statement
Hi, You may try this (didn't get time to test this extensively) indx-which(tt$response!=buy) tt$newcolumn-0 tt[unlist(lapply(seq_along(indx),function(i) {x1-if(indx[i]==nrow(tt)) indx[i] else seq(indx[i]+1,indx[i+1]-1);x2-rbind(tt[indx[1:i],],tt[x1,]); if(any(x2$response==sample)) row.names(x2[duplicated(x2$product),])})),newcolumn]-1 tt # subj response product newcolumn #1 1 sample 1 0 #2 1 sample 2 0 #3 1 buy 3 0 #4 2 sample 2 0 #5 2 buy 2 0 #6 3 sample 3 1 #7 3 sample 2 1 #8 3 buy 1 0 #9 4 sample 1 1 #10 4 buy 4 0 A.K. From: vanessa van der vaart vanessa.va...@gmail.com To: smartpink...@yahoo.com Sent: Thursday, July 25, 2013 3:49 PM Subject: Re: duplicated() with conditional statement thank you for the reply. It based on entire data set. subj response product newcolumn 1 1 sample 1 0 2 1 sample 2 0 3 1 buy 3 0 4 2 sample 2 0 . 5 2 buy 2 0 6 3 sample 3 1 7 3 sample 2 1 8 3 buy 1 0 9 4 sample 1 1 10 4 buy 4 0 I am sorry i didnt question it very clearly, let me change the conditional statement, I hope you can understand. i will explain by example as you can see, almost every number is duplicated, but only in row 6th,7th,and 9th the value on column is 1. on row4th, the value is duplicated( 2 already occurred on 2nd row),but since the value is considered as duplicated only if the value is duplicated where the response is 'buy' than the value on column, on row4th still zero. On row 6th, where the value product column is 3. 3 is already occurred in 3rd row where the value on response is 'buy', so the value on column should be 1 I hope it can understand the conditional statement. On Thu, Jul 25, 2013 at 8:25 PM, smartpink...@yahoo.com wrote: Hi, May be I understand it incorrectly. Your new column value doesn't correspond to your conditional statement. Also, is this duplication based on entire dataset or within subj. quote author='misseb' Hi everybody,, I have a question about R function duplicated(). I have spent days try to figure this out,but I cant find any solution yet. I hope somebody can help me.. this is my data: subj=c(1,1,1,2,2,3,3,3,4,4) response=c('sample','sample','buy','sample','buy','sample','sample','buy','sample','buy') product=c(1,2,3,2,2,3,2,1,1,4) tt=data.frame(subj, response, product) the data look like this: subj response product 1 1 sample 1 2 1 sample 2 3 1 buy 3 4 2 sample 2 5 2 buy 2 6 3 sample 3 7 3 sample 2 8 3 buy 1 9 4 sample 1 10 4 buy 4 I want to create new column based on the value on response and product column. if the value on product is duplicated, then the value on new column is 1, otherwise is 0. but I want to add conditional statement that the value on product column will only be considered as duplicated if the value on response column is 'buy'. for illustration, the table should look like this: subj response product newcolumn 1 1 sample 1 0 2 1 sample 2 0 3 1 buy 3 0 4 2 sample 2 0 5 2 buy 2 0 6 3 sample 3 1 7 3 sample 2 1 8 3 buy 1 0 9 4 sample 1 1 10 4 buy 4 0 can somebody help me? any help will be appreciated. I am new in this mailing list, so forgive me in advance, If I did not ask the question appropriately. /quote Quoted from: http://r.789695.n4.nabble.com/duplicated-with-conditional-statement-tp4672342.html _ Sent from http://r.789695.n4.nabble.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Pairwise comparison between columns, logic
HI, Not sure about what your expected output would be. Also 'CEBPA' was not present in the Data.txt. gset- read.table(Names.txt,header=TRUE,stringsAsFactors=FALSE) temp1- read.table(Data.txt,header=TRUE,stringsAsFactors=FALSE) lst1-split(temp1,temp1$Names) mat1-combn(gset[-1,1],2) #removed CEBPA library(plyr) lst2-lapply(split(mat1,col(mat1)),function(x) {x1-join_all(lst1[x],by=patient_id,type=inner);x1[patient_id] }) names(lst2)-apply(mat1,2,paste,collapse=_) do.call(rbind,lst2) # patient_id #DNMT3A_FLT3.1 LAML-AB-2811-TB #common ids between DNMT3A and FLT3 #DNMT3A_FLT3.2 LAML-AB-2816-TB #DNMT3A_FLT3.3 LAML-AB-2818-TB #DNMT3A_IDH1.1 LAML-AB-2802-TB#common ids between DNMT3A and IDH1. If you wanted it as separate dataframes, use `lst2`. #DNMT3A_IDH1.2 LAML-AB-2822-TB #DNMT3A_NPM1.1 LAML-AB-2802-TB #DNMT3A_NPM1.2 LAML-AB-2809-TB #DNMT3A_NPM1.3 LAML-AB-2811-TB #DNMT3A_NPM1.4 LAML-AB-2816-TB #DNMT3A_NRAS LAML-AB-2816-TB #FLT3_NPM1.1 LAML-AB-2811-TB #FLT3_NPM1.2 LAML-AB-2812-TB #FLT3_NPM1.3 LAML-AB-2816-TB #FLT3_NRAS LAML-AB-2816-TB #IDH1_NPM1 LAML-AB-2802-TB #NPM1_NRAS LAML-AB-2816-TB A.K. Hello R experts, I am trying to solve the following logic. I have two input files. The first file (Names.txt) that has two columns: Column1 Column2 CEBPA CEBPA DNMT3A DNMT3A FLT3FLT3 IDH1IDH1 NPM1NPM1 NRASNRAS and the second input file Data.txt has two columns Names, patient_id. Namepatient_id DNMT3A LAML-AB-2802-TB DNMT3A LAML-AB-2809-TB DNMT3A LAML-AB-2811-TB DNMT3A LAML-AB-2816-TB DNMT3A LAML-AB-2818-TB DNMT3A LAML-AB-2822-TB DNMT3A LAML-AB-2824-TB FLT3LAML-AB-2811-TB FLT3LAML-AB-2812-TB FLT3LAML-AB-2814-TB FLT3LAML-AB-2816-TB FLT3LAML-AB-2818-TB FLT3LAML-AB-2825-TB FLT3LAML-AB-2830-TB FLT3LAML-AB-2834-TB IDH1LAML-AB-2802-TB IDH1LAML-AB-2821-TB What I am attempting to do is for each name in first column of names.txt, I do a pairwise comparison with the other names in the second column based on which patient ids are common. To explain in detail: As an example: I extract patient_ids for CEBPA and DNMT3A and see which are common, then I do the same for CEBPA and FLT3 and so on for CEBPA and the next name in column 2. So far the script I have written only does the comparison with the first name in the list. So essentially with itself. I am not sure why this logic is not working for all the names in column 2 for a single name in column 1. Below is my script: gset-read.table(Names.txt,header=F,na.strings = ., as.is=T) # reading in the genes temp-read.table(Data.txt,header=T,sep=\t) # all-length(unique(temp$fpatient_id)) final-c() both.ab - list() both - list() temp.b - matrix() for(i in 1:nrow(gset)) # Loop for genes in the first column { temp2-temp[which(temp$Column1 %in% gset[i,]),] num.mut-length(unique(temp2$patient_id)) temp.a -temp[which(temp$Column1 == gset[i,1]),] for(j in 1:(nrow(gset)) # Loop for genes in the second column { temp.b -temp[which(temp$Column2 == gset[j,2]),] # See which patient_ids of temp.a are in temp.b both.ab[[i]]-temp.a[which(temp.a$patient_id %in% temp.b$patient_id),] } both[[i]]-both.ab[[i]] num.both-length(unique(both[[i]]$patient_id)) line-c(paste(gset[i, which(!(is.na(gset[i,]))) ],collapse=/), num.mut, all, num.mut/all, num.both) final-rbind(final,line) } Names.txtData.txtScript.txt __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] duplicated() with conditional statement
Hi, Sorry,`indx` should be: indx-which(tt$response==buy) #I changed indx but forgot about it tt$newcolumn-0 tt[unlist(lapply(seq_along(indx),function(i) {x1-if(indx[i]==nrow(tt)) indx[i] else seq(indx[i]+1,indx[i+1]-1);x2-rbind(tt[indx[1:i],],tt[x1,]); if(any(x2$response==sample)) row.names(x2[duplicated(x2$product),])})),newcolumn]-1 tt subj response product newcolumn 1 1 sample 1 0 2 1 sample 2 0 3 1 buy 3 0 4 2 sample 2 0 5 2 buy 2 0 6 3 sample 3 1 7 3 sample 2 1 8 3 buy 1 0 9 4 sample 1 1 10 4 buy 4 0 A.K. From: vanessa van der vaart vanessa.va...@gmail.com To: arun smartpink...@yahoo.com Sent: Thursday, July 25, 2013 8:55 PM Subject: Re: duplicated() with conditional statement hii thanks for the code, I tried the code but i got the error message, Error in from:to : NA/NaN argument I dont know what doest it mean, and I dont know how to fix it.. could you help please.. thank you very much in advance On Thu, Jul 25, 2013 at 10:52 PM, arun smartpink...@yahoo.com wrote: Hi, You may try this (didn't get time to test this extensively) indx-which(tt$response!=buy) tt$newcolumn-0 tt[unlist(lapply(seq_along(indx),function(i) {x1-if(indx[i]==nrow(tt)) indx[i] else seq(indx[i]+1,indx[i+1]-1);x2-rbind(tt[indx[1:i],],tt[x1,]); if(any(x2$response==sample)) row.names(x2[duplicated(x2$product),])})),newcolumn]-1 tt # subj response product newcolumn #1 1 sample 1 0 #2 1 sample 2 0 #3 1 buy 3 0 #4 2 sample 2 0 #5 2 buy 2 0 #6 3 sample 3 1 #7 3 sample 2 1 #8 3 buy 1 0 #9 4 sample 1 1 #10 4 buy 4 0 A.K. From: vanessa van der vaart vanessa.va...@gmail.com To: smartpink...@yahoo.com Sent: Thursday, July 25, 2013 3:49 PM Subject: Re: duplicated() with conditional statement thank you for the reply. It based on entire data set. subj response product newcolumn 1 1 sample 1 0 2 1 sample 2 0 3 1 buy 3 0 4 2 sample 2 0 . 5 2 buy 2 0 6 3 sample 3 1 7 3 sample 2 1 8 3 buy 1 0 9 4 sample 1 1 10 4 buy 4 0 I am sorry i didnt question it very clearly, let me change the conditional statement, I hope you can understand. i will explain by example as you can see, almost every number is duplicated, but only in row 6th,7th,and 9th the value on column is 1. on row4th, the value is duplicated( 2 already occurred on 2nd row),but since the value is considered as duplicated only if the value is duplicated where the response is 'buy' than the value on column, on row4th still zero. On row 6th, where the value product column is 3. 3 is already occurred in 3rd row where the value on response is 'buy', so the value on column should be 1 I hope it can understand the conditional statement. On Thu, Jul 25, 2013 at 8:25 PM, smartpink...@yahoo.com wrote: Hi, May be I understand it incorrectly. Your new column value doesn't correspond to your conditional statement. Also, is this duplication based on entire dataset or within subj. quote author='misseb' Hi everybody,, I have a question about R function duplicated(). I have spent days try to figure this out,but I cant find any solution yet. I hope somebody can help me.. this is my data: subj=c(1,1,1,2,2,3,3,3,4,4) response=c('sample','sample','buy','sample','buy','sample','sample','buy','sample','buy') product=c(1,2,3,2,2,3,2,1,1,4) tt=data.frame(subj, response, product) the data look like this: subj response product 1 1 sample 1 2 1 sample 2 3 1 buy 3 4 2 sample 2 5 2 buy 2 6 3 sample 3 7 3 sample 2 8 3 buy 1 9 4 sample 1 10 4 buy 4 I want to create new column based on the value on response and product column. if the value on product is duplicated, then the value on new column is 1, otherwise is 0. but I want to add conditional statement that the value on product column will only be considered as duplicated if the value on response column is 'buy'. for illustration, the table should look like this: subj response product newcolumn 1 1 sample 1 0 2 1 sample 2 0 3 1 buy 3 0 4 2 sample 2 0 5 2 buy 2 0 6 3 sample 3
[R] help on carrying forward several vectors of rownames
Dear reader, I'm trying to use multiple vectors as rownames for the read.table function. I tried to insert these vectors as a list, e.g. row.names=c(1,4), from the excel file. I tried other ways, as if the argument only took continuous tab separated columns, e.g. row.names=c(1:4). But, this also did not work. Is there a solution to this problem? Regards, -- Franklin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Maintaining data order in factanal with missing data
Hi, I'm new to R, so sorry if this is a simple answer. I'm currently trying to collapse some ordinal variables into a composite; the program ideally should take a data frame as input, perform a factor analysis, compute factor scores, sds, etc., and return the rescaled scores and loadings. The difficulty I'm having is that my data set contains a number of NA, which I am excluding from the analysis using complete.cases(), and thus the incomplete cases are skipped. These functions are for a longitudinal data set with repeated waves of data, so the final rescaled scores from each wave need to be saved as variables grouped by a unique ID (DMID). The functions I'm trying to implement are as follows: weighted.sd-function(x,w){ sum.w-sum(w) sum.w2-sum(w^2) mean.w-sum(x*w)/sum(w) x.sd.w-sqrt((sum.w/(sum.w^2-sum.w2))*sum(w*(x-mean.w)^2)) return(x.sd.w) } re.scale-function(f.scores, raw.data, loadings){ fz.scores-(f.scores+mean(f.scores))/(sd(f.scores)) means-apply(raw.data,1,weighted.mean,w=loadings) sds-apply(raw.data,1,weighted.sd,w=loadings) grand.mean-mean(means) grand.sd-mean(sds) final.scores-((fz.scores*grand.sd)+grand.mean) return(final.scores) } get.scores-function(data){ fact-factanal(data[complete.cases(data),],factors=1,scores=regression) f.scores-fact$scores[,1] f.loads-fact$loadings[,1] rescaled.scores-re.scale(f.scores, data[complete.cases(data),], f.loads) output.list-list(rescaled.scores, f.loads) names(output.list)-c(rescaled.scores, factor.loadings) return(output.list) } init.dfs-function(){ ab.1.df-subset(ab.df,,select=c(dmid,g5oab2:g5ovb1)) ab.2.df-subset(ab.df,,select=c(dmid,w2oab3:w2ovb1)) ab.3.df-subset(ab.df,,select=c(dmid, w3oab3, w3oab4, w3oab7, w3oab8, w3ovb1)) ab.1.fa-get.scores(ab.1.df[-1]) ab.2.fa-get.scores(ab.2.df[-1]) ab.3.fa-get.scores(ab.3.df[-1]) } Thanks for your help, Justin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] GGplot 2 – cannot get histogram and box plot axis to match.
Dear John, Use xlim() and ylim() instead of expand_limits() library(ggplot2) #sample data from ggplot2 data(Cars93, package = MASS) dataSet - Cars93 #variables to calculate the range to extend the axis dataVector - unlist(dataSet[,MPG.city]) dataRange - diff(range(dataSet$MPG.city)) graphRange - c(min(dataSet$MPG.city) - dataRange/5, max(dataSet$MPG.city) + dataRange/5) #making the box plot theBoxPlot - ggplot(dataSet,aes_string(x = MPG.city, y = MPG.city)) theBoxPlot - theBoxPlot + geom_boxplot() + coord_flip() + ylim(limits = graphRange) print(theBoxPlot) #making the histogram thePlot - ggplot(dataSet,aes_string(x = MPG.city)) thePlot - thePlot + geom_histogram() + xlim(graphRange) print(thePlot) ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest team Biometrie Kwaliteitszorg / team Biometrics Quality Assurance Kliniekstraat 25 1070 Anderlecht Belgium + 32 2 525 02 51 + 32 54 43 61 85 thierry.onkel...@inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -Oorspronkelijk bericht- Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Namens John W. Clow Verzonden: donderdag 25 juli 2013 20:03 Aan: r-help@r-project.org Onderwerp: [R] GGplot 2 – cannot get histogram and box plot axis to match. Problem: I am trying to get the histogram and box plot x axis to match. I’ve tried using the expand_limits function to make the axis match but that didn’t make the axis match. The histogram’s axis are still consistently larger than the ones for the box plot (though the function did help). Does anyone have a suggestion as to what I should do instead? Background: I am building a Shiny app that displays a histogram below a bar chart for a set of data that a user uploads to the app. If you want to see the app, go here http://spark.rstudio.com/jclow/Archive20130725HistogramApp/ To run the app, select “Use Sample Data” , then select “MPG.city” under choose a column, then finally select box plot. Sample code: Below is a snippet of my code to demonstrate the problems I have. library(ggplot2) #sample data from ggplot2 data(Cars93, package = MASS) dataSet - Cars93 #variables to calculate the range to extend the axis dataVector - unlist(dataSet[,MPG.city]) dataRange - max(dataVector) - min(dataVector) graphRange - c(min(dataVector) - dataRange/5, max(dataVector) + dataRange/5) #making the box plot theBoxPlot - ggplot(dataSet,aes_string(x = MPG.city,y = MPG.city)) theBoxPlot = theBoxPlot + geom_boxplot() + expand_limits(y= graphRange) + coord_flip() print(theBoxPlot) #making the histogram thePlot - ggplot(dataSet,aes_string(x = MPG.city)) thePlot -thePlot + geom_histogram() + expand_limits(x= graphRange) print(thePlot) Thank you for taking the time to read this. John Clow UCSB Student __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. * * * * * * * * * * * * * D I S C L A I M E R * * * * * * * * * * * * * Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is door een geldig ondertekend document. The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lme (weights) and glht
Dear Sibylle, Have you tried to create a new variable? ME$fDiversity - factor(ME$Diversity) H08_lme - lme( log(Height2008_mean) ~ fDiversity, data = ME, random = ~ 1|Plot/SubPlot, weights = varPower(form = ~Diversity), na.action = na.omit, subset = ME$Species == Ace_pse, method = ML ) summary(H08_lme) anova(H08_lme) g_H08_lme - glht(H08_lme, linfct = mcp(fDiversity = Tukey)) Please note that Iâve added (a lot of) whitespace to your code to make it easier to read. Best regards, Thierry ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest team Biometrie Kwaliteitszorg / team Biometrics Quality Assurance Kliniekstraat 25 1070 Anderlecht Belgium + 32 2 525 02 51 + 32 54 43 61 85 thierry.onkel...@inbo.bemailto:thierry.onkel...@inbo.be www.inbo.behttp://www.inbo.be/ To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Namens Sibylle Stöckli Verzonden: donderdag 25 juli 2013 12:16 Aan: r-help@r-project.org Onderwerp: [R] lme (weights) and glht Dear R members, I tried to fit an lme model and to use the glht function of multcomp. However, the glht function gives me some errors when using weights=varPower(). The glht error makes sense as glht needs factor levels and the model works fine without weights=. Does anyone know a solution so I do not have to change the lme model? Thanks Sibylle -- works fine ME$Diversity=factor(ME$Diversity) H08_lme-lme(log(Height2005_mean)~Diversity, data=ME, random=~1|Plot/ SubPlot, na.action=na.omit, subset=ME$Species==Pse_men, method=ML) summary(H08_lme) anova(H08_lme) g_H08_lme-glht(H08_lme, linfct=mcp(Diversity=Tukey)) print(summary(g_H08_lme)) -- using lme with weights I changed the order of factor() and introduced as.factor in the model H08_lme-lme(log(Height2008_mean)~as.factor(Diversity), data=ME, random=~1|Plot/SubPlot, weights=varPower(form=~Diversity), na.action=na.omit, subset=ME$Species==Ace_pse, method=ML) summary(H08_lme) anova(H08_lme) ME$Diversity=factor(ME$Diversity) g_H08_lme-glht(H08_lme, linfct=mcp(Diversity=Tukey)) Error in mcp2matrix(model, linfct = linfct) : Variable(s) �Diversity� have been specified in �linfct� but cannot be found in �model�! [[alternative HTML version deleted]] __ R-help@r-project.orgmailto:R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. * * * * * * * * * * * * * D I S C L A I M E R * * * * * * * * * * * * * Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is door een geldig ondertekend document. The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] number of items to replace is not a multiple of replacement length
Hello, Once again, the matrix EWMA has not the correct size. Did you carefully read the answer by Thomas Stewart? https://stat.ethz.ch/pipermail/r-help/attachments/20130724/c454b0f7/attachment.pl Extract of his reply: When you expand the example to 5 stocks, there will be 15 elements (5 variances and 10 covariances) EWMA-matrix(nrow=T,ncol=15)) Regards, Pascal On 26/07/2013 14:37, G Girija wrote: Hi All, I have 5 stock values and i am calculating EWMA followed the logic as given ind following link.[ http://www.orecastingfinancialrisk.com/3.htmlhttp://www.forecastingfinancialrisk.com/3.html ] library('tseries') returns[,1]-returns[,1]-mean(returns[,1]) returns[,2]-returns[,2]-mean(returns[,2]) returns[,3]-returns[,3]-mean(returns[,3]) returns[,4]-returns[,4]-mean(returns[,4]) returns[,5]-returns[,5]-mean(returns[,5]) T-length(returns[,1]) T EWMA-matrix(nrow=T,ncol=5) lambda=0.94 S-cov(returns) S EWMA[1,] - S[lower.tri(S,diag=TRUE)] *Error in EWMA[1, ] - S[lower.tri(S, diag = TRUE)] : * * number of items to replace is not a multiple of replacement length* * * for(i in 2:T) { S- lambda*S +(1-lambda)*t(returns[i])%*% returns[i] EWMA[i,] - S[lower.tri(S,diag=TRUE)] } * * [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Maintaining data order in factanal with missing data
Hi You provided functions, so far so good. But without data it would be quite difficult to understand what the functions do and where could be the issue. I suspect combination of complete cases selection together with subset and factor behaviour. But I can be completely out of target too. Petr -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of s00123...@myacu.edu.au Sent: Friday, July 26, 2013 9:35 AM To: r-help@r-project.org Subject: [R] Maintaining data order in factanal with missing data Hi, I'm new to R, so sorry if this is a simple answer. I'm currently trying to collapse some ordinal variables into a composite; the program ideally should take a data frame as input, perform a factor analysis, compute factor scores, sds, etc., and return the rescaled scores and loadings. The difficulty I'm having is that my data set contains a number of NA, which I am excluding from the analysis using complete.cases(), and thus the incomplete cases are skipped. These functions are for a longitudinal data set with repeated waves of data, so the final rescaled scores from each wave need to be saved as variables grouped by a unique ID (DMID). The functions I'm trying to implement are as follows: weighted.sd-function(x,w){ sum.w-sum(w) sum.w2-sum(w^2) mean.w-sum(x*w)/sum(w) x.sd.w-sqrt((sum.w/(sum.w^2-sum.w2))*sum(w*(x-mean.w)^2)) return(x.sd.w) } re.scale-function(f.scores, raw.data, loadings){ fz.scores-(f.scores+mean(f.scores))/(sd(f.scores)) means-apply(raw.data,1,weighted.mean,w=loadings) sds-apply(raw.data,1,weighted.sd,w=loadings) grand.mean-mean(means) grand.sd-mean(sds) final.scores-((fz.scores*grand.sd)+grand.mean) return(final.scores) } get.scores-function(data){ fact- factanal(data[complete.cases(data),],factors=1,scores=regression) f.scores-fact$scores[,1] f.loads-fact$loadings[,1] rescaled.scores-re.scale(f.scores, data[complete.cases(data),], f.loads) output.list-list(rescaled.scores, f.loads) names(output.list)- c(rescaled.scores, factor.loadings) return(output.list) } init.dfs-function(){ ab.1.df-subset(ab.df,,select=c(dmid,g5oab2:g5ovb1)) ab.2.df-subset(ab.df,,select=c(dmid,w2oab3:w2ovb1)) ab.3.df-subset(ab.df,,select=c(dmid, w3oab3, w3oab4, w3oab7, w3oab8, w3ovb1)) ab.1.fa-get.scores(ab.1.df[-1]) ab.2.fa-get.scores(ab.2.df[-1]) ab.3.fa-get.scores(ab.3.df[-1]) } Thanks for your help, Justin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to split two levels several times?
Hi Rui Arun, really thanks for investing so much time to deal with this problem! The code works now for this specific example. However it is not generally robust for slightly different situations. For instance it cannot correctly handle a slight variation of the table where I have again 4 types of electrodes of certain lengths. Electrode4 exists only 6 times. At the transition of the combinations 3-4 and 4-1 there are 12 times electrode4 which stick together in the output $`9`. This leads to wrong splittings thereafter. Sorry for asking such tricky questions. New table XXX electrode length electrode1 206 electrode1 194 electrode1 182 electrode1 172 electrode1 169 electrode2 82 electrode2 78 electrode2 70 electrode2 58 electrode1 206 electrode1 194 electrode1 182 electrode1 172 electrode1 169 electrode3 260 electrode3 176 electrode3 137 electrode1 206 electrode1 194 electrode1 182 electrode1 172 electrode1 169 electrode4 86 electrode4 66 electrode4 64 electrode4 52 electrode4 27 electrode4 26 electrode2 82 electrode2 78 electrode2 70 electrode2 58 electrode1 206 electrode1 194 electrode1 182 electrode1 172 electrode1 169 electrode2 82 electrode2 78 electrode2 70 electrode2 58 electrode3 260 electrode3 176 electrode3 137 electrode2 82 electrode2 78 electrode2 70 electrode2 58 electrode4 86 electrode4 66 electrode4 64 electrode4 52 electrode4 27 electrode4 26 electrode3 260 electrode3 176 electrode3 137 electrode1 206 electrode1 194 electrode1 182 electrode1 172 electrode1 169 electrode3 260 electrode3 176 electrode3 137 electrode2 82 electrode2 78 electrode2 70 electrode2 58 electrode3 260 electrode3 176 electrode3 137 electrode4 86 electrode4 66 electrode4 64 electrode4 52 electrode4 27 electrode4 26 electrode4 86 electrode4 66 electrode4 64 electrode4 52 electrode4 27 electrode4 26 electrode1 206 electrode1 194 electrode1 182 electrode1 172 electrode1 169 electrode4 86 electrode4 66 electrode4 64 electrode4 52 electrode4 27 electrode4 26 electrode2 82 electrode2 78 electrode2 70 electrode2 58 electrode4 86 electrode4 66 electrode4 64 electrode4 52 electrode4 27 electrode4 26 electrode3 260 electrode3 176 electrode3 137 Gesendet: Donnerstag, 25. Juli 2013 um 20:53 Uhr Von: Rui Barradas ruipbarra...@sapo.pt An: dennis1...@gmx.net Cc: r-help@r-project.org Betreff: Re: Aw: Re: Re: Re: [R] How to split two levels several times? Hello, I think the following does what you want. (I don't know if it makes much sense but it works.) lens - rle(as.character(XXX$electrode))$lengths m - length(lens) %/% 2 idx - rep(1:m, sapply(1:m, function(.m) sum(lens[(2*.m - 1):(2*.m)]))) if(length(lens) %% 2 != 0){ idx - c(idx, rep(m + 1, lens[length(lens)])) sp_idx - split(idx, idx) n - length(sp_idx[[m]]) if(n %/% 2 length(sp_idx[[m + 1]])) sp_idx[[m]][(n %/% 2 + 1):n] - sp_idx[[m + 1]][1] else sp_idx[[m]][(n - length(sp_idx[[m + 1]]) + 1):n] - sp_idx[[m + 1]][1] idx - unlist(sp_idx) } sp - split(XXX, idx) sp Rui Barradas Em 25-07-2013 11:40, dennis1...@gmx.net escreveu: Hi Rui once more thank you for your help. But the code does so far not solve the problem because it still treats rows 17-22 (repeated appearance of electrode1) as one single level. However as can be seen by rows 1-3 (or rows 17-19 and rows 20-22) and the order of the length variable (row 1 = 5.7, row 2 = 6.3, row 3 = 6.2) electrode1 consists only of 3 rows. Maybe that was not made absolutely clear by me. As described in my mail before if by chance (or systematically) it happens to be that electrode1 appears right after each other in the table then the code should split it “half way”. So idx should not return [1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 but instead 6 times number 4 at the end [1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 Do you have any solution? Gesendet: Mittwoch, 24. Juli 2013 um 23:47 Uhr Von: Rui Barradas ruipbarra...@sapo.pt An: dennis1...@gmx.net Cc: r-help@r-project.org Betreff: Re: Aw: Re: Re: [R] How to split two levels several times? Hello, As for the first question, note that in the case you describe, the resulting list of df's will not be a split of the original, there will be a duplication in the final 4-1 and 1-3. The following is a hack but will do it. lens -
[R] a very urgunt and important question about R
Dear all I am doing a research in clinical psychology and I need to Generate a block randomization for a clinical trial, with two treatments and three therapists. I mean I must randomize the clients (all women) to two kinds of treatment (called treatment as usual and narrative exposure therapy) and three different therapists. I tried to do this randomization by using R software and by using stratification package but I didn't (and still can not) understand how I should use R for this job. Could you please help me and show me how I should use R for this kind of randomization in a really easy way? I really appreciate your help in advance. Best Mina P.S. I just know little about statistics and software. Please teach me in a really easy way. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Help R
Hi to everyone, first of all thanks for this service, it is being very useful for me, thanks in advance. I am new in R, so I suppose I could make really naive questions, I'm sorry. I have to impute some missing values and I am trying to do it with VIM library trough Hot Deck imputation. I writte:vmGUImenu(), and it opens a small window of: Visualization and Imputation of Missing Values and I select Imptation and Hot Deck and then one of the variables which I have to select is Select Variables to Build Domains. I don't know which variables I have to select, I don't understand this. I have tried don't put anything and I get : hotdeck(dataframe,variable=c(CRV.IE.2005,CRV.IE.2006,CRV.IE.2007,CRV.IE.2008,CRV.IE.2009,CR V.IE.2010),ord_var=c(CRV.IE.2001,CRV.IE.2002,CRV.IE.2003,CRV.IE.2004,CRV.IE.2005,CRV.IE.20 06,CRV.IE.2007,CRV.IE.2008,CRV.IE.2009,CRV.IE.2010),domain_var=NULL,imp_suffix=_imp) Mensajes de aviso perdido: In hotdeck(data, variable = vars, ord_var = sort, domain_var = domain, Some NAs remained, maybe due to a too restrictive domain building!? In hotdeck(b, variable = c(CRV.IE.2005, CRV.IE.2006, CRV.IE.2007, Some NAs remained, maybe due to a too restrictive domain building!? What should I put in this variable?? Thanks in advance Best regards Teresa [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] n-dash instead of hyphens in plots
Dear community, How do I change the hyphens that appear when using '-' to n-dash in, for example, x-axes? Example code: axis(1, c(1:2), c(1-4, 23-26), tck=.05) Thanks Anders [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] n-dash instead of hyphens in plots
On 26/07/2013 09:36, Anders Sand wrote: Dear community, How do I change the hyphens that appear when using '-' to n-dash in, for example, x-axes? Example code: axis(1, c(1:2), c(1-4, 23-26), tck=.05) On what device on what platform? The commonest answer would be 'if you want en-dash not hyphen, ask for en-dash and not hyphen' en-dash is \u2013 where that is supported. Thanks Anders [[alternative HTML version deleted]] PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. This did ask for 'at a minimum' information and no HTML. -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] a very urgunt and important question about R
Hi I recommend to start with library(fortunes) fortune(surgery) Anyway, beside Introduction to R you could go through CRAN Task View: Design of Experiments (DoE) Analysis of Experimental Data and specifically through Experimental designs for clinical trials This task view only covers specific design of experiments packages; there may be some grey areas. Please, also consult the ClinicalTrials task view. *experiment contains tools for clinical experiments, e.g., a randomization tool, and it provides a few special analysis options for clinical trials. *Package gsDesign implements group sequential designs, *Package gsbDesign evaluates operating characteristics for group sequential Bayesian designs, *package asd implements adaptive sequential designs. *Package TEQR provides toxicity equivalence range designs (Blanchard and Longmate 2010) for phase I clinical trials. *The DoseFinding package provides functions for the design and analysis of dose-finding experiments (for example pharmaceutical Phase II clinical trials); it combines the facilities of the MCPMod package (maintenance discontinued; described in Bornkamp, Pinheiro and Bretz 2009) with a special type of optimal designs for dose finding situations (MED-optimal designs, or D-optimal designs, or a mixture of both; cf., Dette et al. 2008) Which shall provide you with some insight. Regards Petr -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of mina orang Sent: Friday, July 26, 2013 10:26 AM To: r-help@r-project.org Subject: [R] a very urgunt and important question about R Dear all I am doing a research in clinical psychology and I need to Generate a block randomization for a clinical trial, with two treatments and three therapists. I mean I must randomize the clients (all women) to two kinds of treatment (called treatment as usual and narrative exposure therapy) and three different therapists. I tried to do this randomization by using R software and by using stratification package but I didn't (and still can not) understand how I should use R for this job. Could you please help me and show me how I should use R for this kind of randomization in a really easy way? I really appreciate your help in advance. Best Mina P.S. I just know little about statistics and software. Please teach me in a really easy way. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] a very urgunt and important question about R
Hi, You may try: library(psych) ?block.random() Also, look this link http://personality-project.org/revelle/syllabi/205/block.randomization.pdf A.K. - Original Message - From: mina orang minaor...@gmail.com To: r-help@r-project.org Cc: Sent: Friday, July 26, 2013 4:26 AM Subject: [R] a very urgunt and important question about R Dear all I am doing a research in clinical psychology and I need to Generate a block randomization for a clinical trial, with two treatments and three therapists. I mean I must randomize the clients (all women) to two kinds of treatment (called treatment as usual and narrative exposure therapy) and three different therapists. I tried to do this randomization by using R software and by using stratification package but I didn't (and still can not) understand how I should use R for this job. Could you please help me and show me how I should use R for this kind of randomization in a really easy way? I really appreciate your help in advance. Best Mina P.S. I just know little about statistics and software. Please teach me in a really easy way. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Jul 26, 2013; 12:34am
Thanks Rich and Jim and apologies for omitting the line x - c(285, 43.75, 94, 150, 214, 375, 270, 350, 41.5, 210, 30, 37.6, 281, 101, 210) But the fundamental problem remains that vertical spacing is not correct unless I waste a lot of image space at the top. Frank -- Frank E Harrell Jr Professor and Chairman School of Medicine Department of Biostatistics Vanderbilt University __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Repeated measures Cox regression ??coxph??
Two choices. If this were a linear model, do you like the GEE approach or a mixed effects approach? Assume that subject is a variable containing a per-subject identifier. GEE approach: add + cluster(subject) to the model statement in coxph Mixed models approach: Add + (1|subject) to the model statment in coxme. When only a very few subjects have multiple events, the mixed model (random effect) approach may not be reliable, however. Multiple events per group are the fuel for estimation of the variance of the random effect, and with few of these the profile likelihood of the random effect will be very flat. You can get esssentially a random estimate of the variance of the subject effect. I'm still getting my arms around this issue, and it has taken me a long time. Frailty is an alternate label for random effects when all we have is a random intercept. Multiple labels for the same idea adds confusion, but nothing else. Terry Therneau On 07/25/2013 08:14 PM, Marc Schwartz wrote: On Jul 25, 2013, at 4:45 PM, David Winsemiusdwinsem...@comcast.net wrote: On Jul 25, 2013, at 12:27 PM, Marc Schwartz wrote: On Jul 25, 2013, at 2:11 PM, John Sorkinjsor...@grecc.umaryland.edu wrote: Colleagues, Is there any R package that will allow one to perform a repeated measures Cox Proportional Hazards regression? I don't think coxph is set up to handle this type of problem, but I would be happy to know that I am not correct. I am doing a study of time to hip joint replacement. As each person has two hips, a given person can appear in the dataset twice, once for the left hip and once for the right hip, and I need to account for the correlation of data from a single individual. Thank you, John John, See Terry's 'coxme' package: http://cran.r-project.org/web/packages/coxme/index.html When I looked over the description of coxme, I was concerned it was not really designed with this in mind. Looking at Therneau and Grambsch, I thought section 8.4.2 in the 'Multiple Events per Subject' Chapter fit the analysis question well. There they compared the use of coxph( ...+cluster(ID),,...) withcoxph( ...+strata(ID),,...). Unfortunately I could not tell for sure which one was being described as superio but I think it was the cluster() alternative. I seem to remember there are discussions in the archives. David, I think that you raise a good point. The example in the book (I had to wait to get home to read it) is potentially different however, in that the subject's eye's were randomized to treatment or control, which would seem to suggest comparable baseline characteristics for each pair of eyes, as well as an active intervention on one side where a difference in treatment effect between each eye is being analyzed. It is not clear from John's description above if there is one hip that will be treated versus one as a control and whether the extent of disease at baseline is similar in each pair of hips. Presumably the timing of hip replacements will be staggered at some level, even if there is comparable disease, simply due to post-op recovery time and surgical risk. In cases where the disease between each hip is materially different, that would be another factor to consider, however I would defer to orthopaedic physicians/surgeons from a subject matter expertise consideration. It is possible that the bilateral hip replacement data might be more of a parallel to bilateral breast cancer data, if each breast were to be tracked separately. I have cc'd Terry here, hoping that he might jump in and offer some insights into the pros/cons of using coxme versus coxph with either a cluster or strata based approach, or perhaps even a frailty based approach as in 9.4.1 in the book. Regards, Marc -- David. You also might find the following of interest: http://bjo.bmj.com/content/71/9/645.full.pdf http://www.ncbi.nlm.nih.gov/pubmed/6885 http://www.ncbi.nlm.nih.gov/pubmed/22078901 Regards, Marc Schwartz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to split two levels several times?
Hi Dennis, No problem. As I mentioned, I was under the understanding that electrode1 occurs only in multiples of 3. I think your earlier post sounded like that. May be I was misunderstood. So, my solution was based on that. In the below table, I am not sure, how you wanted to split. For ex. it is not clear, whether rows 1-5 (electrode1) are only in the first list element or should it include the next electrode electrode2 also? Could you also show the expected output? dat$Len # [1] 5 4 5 3 5 6 4 5 4 3 4 6 3 5 3 4 3 12 5 6 4 6 3 as.character(dat$Val) # [1] electrode1 electrode2 electrode1 electrode3 electrode1 #[6] electrode4 electrode2 electrode1 electrode2 electrode3 #[11] electrode2 electrode4 electrode3 electrode1 electrode3 #[16] electrode2 electrode3 electrode4 electrode1 electrode4 #[21] electrode2 electrode4 electrode3 A.K. Hi Rui Arun, really thanks for investing so much time to deal with this problem! The code works now for this specific example. However it is not generally robust for slightly different situations. For instance it cannot correctly handle a slight variation of the table where I have again 4 types of electrodes of certain lengths. Electrode4 exists only 6 times. At the transition of the combinations 3-4 and 4-1 there are 12 times electrode4 which stick together in the output $`9`. This leads to wrong splittings thereafter. Sorry for asking such tricky questions. New table XXX electrode length electrode1 206 electrode1 194 electrode1 182 electrode1 172 electrode1 169 electrode2 82 electrode2 78 electrode2 70 electrode2 58 electrode1 206 electrode1 194 electrode1 182 electrode1 172 electrode1 169 electrode3 260 electrode3 176 electrode3 137 electrode1 206 electrode1 194 electrode1 182 electrode1 172 electrode1 169 electrode4 86 electrode4 66 electrode4 64 electrode4 52 electrode4 27 electrode4 26 electrode2 82 electrode2 78 electrode2 70 electrode2 58 electrode1 206 electrode1 194 electrode1 182 electrode1 172 electrode1 169 electrode2 82 electrode2 78 electrode2 70 electrode2 58 electrode3 260 electrode3 176 electrode3 137 electrode2 82 electrode2 78 electrode2 70 electrode2 58 electrode4 86 electrode4 66 electrode4 64 electrode4 52 electrode4 27 electrode4 26 electrode3 260 electrode3 176 electrode3 137 electrode1 206 electrode1 194 electrode1 182 electrode1 172 electrode1 169 electrode3 260 electrode3 176 electrode3 137 electrode2 82 electrode2 78 electrode2 70 electrode2 58 electrode3 260 electrode3 176 electrode3 137 electrode4 86 electrode4 66 electrode4 64 electrode4 52 electrode4 27 electrode4 26 electrode4 86 electrode4 66 electrode4 64 electrode4 52 electrode4 27 electrode4 26 electrode1 206 electrode1 194 electrode1 182 electrode1 172 electrode1 169 electrode4 86 electrode4 66 electrode4 64 electrode4 52 electrode4 27 electrode4 26 electrode2 82 electrode2 78 electrode2 70 electrode2 58 electrode4 86 electrode4 66 electrode4 64 electrode4 52 electrode4 27 electrode4 26 electrode3 260 electrode3 176 electrode3 137 - Original Message - From: arun smartpink...@yahoo.com To: dennis1...@gmx.net dennis1...@gmx.net Cc: Sent: Friday, July 26, 2013 1:05 AM Subject: Re: [R] How to split two levels several times? Just to add: I assumed that electrode1 would be found in multiples of 3. I could be wrong. - Original Message - From: arun smartpink...@yahoo.com To: Rui Barradas ruipbarra...@sapo.pt Cc: dennis1...@gmx.net dennis1...@gmx.net; r-help@r-project.org r-help@r-project.org Sent: Friday, July 26, 2013 12:53 AM Subject: Re: [R] How to split two levels several times? May be this also helps: XXX: dataset rl-rle(as.character(XXX$electrode)) dat-do.call(rbind,lapply(seq_along(rl$lengths),function(i){x1-if(rl$values[i]==electrode1 (rl$lengths[i]%/%31)) rep(3,rl$lengths[i]%/%3) else rl$lengths[i];data.frame(Len=x1,Val=rl$values[i])})) lst1-split(cumsum(dat[,1]),((seq_along(dat[,1])-1)%/%2)+1) vec1-sapply(lst1,max) vec2-c(1,vec1[-length(vec1)]+1) res- lapply(seq_along(lst1),function(i) {x1-lst1[[i]]; XXX[seq(vec2[i],max(x1)),]}) res #[[1]] # electrode length #1 electrode1 5.7 #2 electrode1 6.3 #3 electrode1 6.2 #4 electrode2 11.4 #5 electrode2 9.7 # #[[2]] # electrode length #6 electrode3 14.2 #7 electrode3 14.8 #8 electrode3
Re: [R] Maintaining data order in factanal with missing data
Hi Well, the function init.dfs does nothing as all data frames created inside it does not propagate to global environment and there is nothing what the function returns. Tha last line (when used outside a function) gives warnings but there is no sign of error. When head(ab.1.df) dmid g5oab2 g53 g54 g55 g5ovb1 11 1.418932 1.805227 2.791152 3.624116 3.425586 22 2.293907 1.187830 1.611237 1.748526 3.816533 33 2.836536 2.679523 1.279639 2.674986 2.452395 44 1.872259 3.278359 1.785872 2.458315 1.146480 55 1.467195 1.180747 3.564127 3.007682 2.109506 66 3.098512 3.151974 3.969379 3.750571 1.497358 head(ab.2.df) dmid w2oab3 w22 w23 w24 w2ovb1 11 4.831362 5.522764 7.809366 6.969172 7.398385 22 6.706346 4.101742 1.434697 5.266775 5.357641 33 3.653806 2.666885 1.209326 5.125556 4.963374 44 7.221255 7.649152 6.540398 6.648506 2.576081 55 1.848023 5.044314 2.761881 3.307220 1.454234 66 7.606429 4.911766 2.034813 2.638573 2.818834 head(ab.3.df) dmid w3oab3 w3oab4 w3oab7 w3oab8 w3ovb1 11 5.835609 6.108220 6.587721 2.451461 2.785467 22 4.973198 1.196815 6.388056 1.110877 4.226463 33 3.800367 6.697287 5.235345 6.666829 6.319073 44 1.093141 1.43 2.269252 3.194978 4.916342 55 1.975060 7.204516 4.825435 1.775874 3.484027 66 3.273361 2.243805 5.326547 5.720892 6.118723 str(ab.1.fa) List of 2 $ rescaled.scores: Named num [1:154] 3.43 3.83 2.43 1.1 2.08 ... ..- attr(*, names)= chr [1:154] 1 2 3 4 ... $ factor.loadings: Named num [1:5] -0.0106 -0.0227 -0.1093 -0.0912 0.9975 ..- attr(*, names)= chr [1:5] g5oab2 g53 g54 g55 ... str(ab.2.fa) List of 2 $ rescaled.scores: Named num [1:154] 6.34 5.24 5.3 1.91 2.16 ... ..- attr(*, names)= chr [1:154] 1 2 3 4 ... $ factor.loadings: Named num [1:5] -0.2042 0.0063 -0.2287 -0.0119 0.7138 ..- attr(*, names)= chr [1:5] w2oab3 w22 w23 w24 ... str(ab.3.fa) List of 2 $ rescaled.scores: Named num [1:154] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... ..- attr(*, names)= chr [1:154] 1 2 3 4 ... $ factor.loadings: Named num [1:5] -0.1172 0.0128 -0.0968 0.106 0.9975 ..- attr(*, names)= chr [1:5] w3oab3 w3oab4 w3oab7 w3oab8 ... Anyway I have no idea what you consider wrong? Regards Petr -Original Message- From: Justin Delahunty [mailto:a...@genius.net.au] Sent: Friday, July 26, 2013 2:22 PM To: PIKAL Petr; 'Justin Delahunty'; r-help@r-project.org Subject: RE: [R] Maintaining data order in factanal with missing data Hi Petr, Thanks for the quick response. Unfortunately I cannot share the data I am working with, however please find attached a suitable R workspace with generated data. It has the appropriate variable names, only the data has been changed. The last function in the list (init.dfs()) I call to subset the overall data set into the three waves, then conduct the factor analysis on each (1 factor CFA); it's just in a function to ease re-typing in a new workspace. Thanks, Justin -Original Message- From: PIKAL Petr [mailto:petr.pi...@precheza.cz] Sent: Friday, 26 July 2013 7:35 PM To: Justin Delahunty; r-help@r-project.org Subject: RE: [R] Maintaining data order in factanal with missing data Hi You provided functions, so far so good. But without data it would be quite difficult to understand what the functions do and where could be the issue. I suspect combination of complete cases selection together with subset and factor behaviour. But I can be completely out of target too. Petr -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of s00123...@myacu.edu.au Sent: Friday, July 26, 2013 9:35 AM To: r-help@r-project.org Subject: [R] Maintaining data order in factanal with missing data Hi, I'm new to R, so sorry if this is a simple answer. I'm currently trying to collapse some ordinal variables into a composite; the program ideally should take a data frame as input, perform a factor analysis, compute factor scores, sds, etc., and return the rescaled scores and loadings. The difficulty I'm having is that my data set contains a number of NA, which I am excluding from the analysis using complete.cases(), and thus the incomplete cases are skipped. These functions are for a longitudinal data set with repeated waves of data, so the final rescaled scores from each wave need to be saved as variables grouped by a unique ID (DMID). The functions I'm trying to implement are as follows: weighted.sd-function(x,w){ sum.w-sum(w) sum.w2-sum(w^2) mean.w-sum(x*w)/sum(w) x.sd.w-sqrt((sum.w/(sum.w^2-sum.w2))*sum(w*(x-mean.w)^2)) return(x.sd.w) } re.scale-function(f.scores, raw.data,
[R] modeest with non-numeric data?
Hello, I have recently discovered the modeest library, and am trying to understand how to use it with non-numeric data (e.g. determining the most common last name, or analysing customer demographics âby zip code). I have the mlv() function working for numeric (double and integer) data, but it throws either an error or a warning and produces unexpected output with character data. Any help is appreciated. A simple example: my.rand.letters - sample(letters, size=100, replace=TRUE) mlv(my.rand.letters, mode=C(discrete))Error in match.arg(x, .distribList) : 'arg' must be of length 1In addition: There were 21 warnings (use warnings() to see them) mlv(as.factor(my.rand.letters))Mode (most frequent value): NA NA Bickel's modal skewness: -2 Call: mlv.factor(x = as.factor(my.rand.letters)) Warning message:In discrete(x, ...) : NAs introduced by coercion TIA, Tom [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to split two levels several times?
Hi Dennis, I guess in this case, instead of Eletrode1 occuring 3 times, it is Electrode4 exists only 6 times. If that is the situation: just change: XXX: data rl-rle(as.character(XXX$electrode)) dat-do.call(rbind,lapply(seq_along(rl$lengths),function(i){x1-if(rl$values[i]==electrode4 (rl$lengths[i]%/%61)) rep(6,rl$lengths[i]%/%6) else rl$lengths[i];data.frame(Len=x1,Val=rl$values[i])})) lst1-split(cumsum(dat[,1]),((seq_along(dat[,1])-1)%/%2)+1) vec1-sapply(lst1,max) vec2-c(1,vec1[-length(vec1)]+1) res- lapply(seq_along(lst1),function(i) {x1-lst1[[i]]; XXX[seq(vec2[i],max(x1)),]}) res [[1]] electrode length 1 electrode1 206 2 electrode1 194 3 electrode1 182 4 electrode1 172 5 electrode1 169 6 electrode2 82 7 electrode2 78 8 electrode2 70 9 electrode2 58 [[2]] electrode length 10 electrode1 206 11 electrode1 194 12 electrode1 182 13 electrode1 172 14 electrode1 169 15 electrode3 260 16 electrode3 176 17 electrode3 137 [[3]] electrode length 18 electrode1 206 19 electrode1 194 20 electrode1 182 21 electrode1 172 22 electrode1 169 23 electrode4 86 24 electrode4 66 25 electrode4 64 26 electrode4 52 27 electrode4 27 28 electrode4 26 [[4]] electrode length 29 electrode2 82 30 electrode2 78 31 electrode2 70 32 electrode2 58 33 electrode1 206 34 electrode1 194 35 electrode1 182 36 electrode1 172 37 electrode1 169 [[5]] electrode length 38 electrode2 82 39 electrode2 78 40 electrode2 70 41 electrode2 58 42 electrode3 260 43 electrode3 176 44 electrode3 137 [[6]] electrode length 45 electrode2 82 46 electrode2 78 47 electrode2 70 48 electrode2 58 49 electrode4 86 50 electrode4 66 51 electrode4 64 52 electrode4 52 53 electrode4 27 54 electrode4 26 [[7]] electrode length 55 electrode3 260 56 electrode3 176 57 electrode3 137 58 electrode1 206 59 electrode1 194 60 electrode1 182 61 electrode1 172 62 electrode1 169 [[8]] electrode length 63 electrode3 260 64 electrode3 176 65 electrode3 137 66 electrode2 82 67 electrode2 78 68 electrode2 70 69 electrode2 58 [[9]] electrode length 70 electrode3 260 71 electrode3 176 72 electrode3 137 73 electrode4 86 74 electrode4 66 75 electrode4 64 76 electrode4 52 77 electrode4 27 78 electrode4 26 [[10]] electrode length 79 electrode4 86 80 electrode4 66 81 electrode4 64 82 electrode4 52 83 electrode4 27 84 electrode4 26 85 electrode1 206 86 electrode1 194 87 electrode1 182 88 electrode1 172 89 electrode1 169 [[11]] electrode length 90 electrode4 86 91 electrode4 66 92 electrode4 64 93 electrode4 52 94 electrode4 27 95 electrode4 26 96 electrode2 82 97 electrode2 78 98 electrode2 70 99 electrode2 58 [[12]] electrode length 100 electrode4 86 101 electrode4 66 102 electrode4 64 103 electrode4 52 104 electrode4 27 105 electrode4 26 106 electrode3 260 107 electrode3 176 108 electrode3 137 A.K. - Original Message - From: dennis1...@gmx.net dennis1...@gmx.net To: Rui Barradas ruipbarra...@sapo.pt; r-help@r-project.org Cc: Sent: Friday, July 26, 2013 6:07 AM Subject: Re: [R] How to split two levels several times? Hi Rui Arun, really thanks for investing so much time to deal with this problem! The code works now for this specific example. However it is not generally robust for slightly different situations. For instance it cannot correctly handle a slight variation of the table where I have again 4 types of electrodes of certain lengths. Electrode4 exists only 6 times. At the transition of the combinations 3-4 and 4-1 there are 12 times electrode4 which stick together in the output $`9`. This leads to wrong splittings thereafter. Sorry for asking such tricky questions. New table XXX electrode length electrode1 206 electrode1 194 electrode1 182 electrode1 172 electrode1 169 electrode2 82 electrode2 78 electrode2 70 electrode2 58 electrode1 206 electrode1 194 electrode1 182 electrode1 172 electrode1 169 electrode3 260 electrode3 176 electrode3 137 electrode1 206 electrode1 194 electrode1 182 electrode1 172 electrode1 169 electrode4 86 electrode4 66 electrode4 64 electrode4 52 electrode4 27 electrode4 26 electrode2 82 electrode2 78 electrode2 70 electrode2 58 electrode1 206 electrode1 194 electrode1 182 electrode1 172 electrode1 169 electrode2 82 electrode2 78 electrode2 70 electrode2 58 electrode3 260 electrode3 176 electrode3 137 electrode2 82 electrode2 78 electrode2 70 electrode2 58 electrode4 86 electrode4 66 electrode4 64 electrode4 52 electrode4 27 electrode4
Re: [R] How to split two levels several times?
It would be better to wrap it in a function. fun1- function(x,colName,N,value){ rl- rle(as.character(x[,colName])) dat-do.call(rbind,lapply(seq_along(rl$lengths),function(i){x1-if(rl$values[i]==value (rl$lengths[i]%/%N1)) rep(N,rl$lengths[i]%/%N) else rl$lengths[i];data.frame(Len=x1,Val=rl$values[i])})) lst1-split(cumsum(dat[,1]),((seq_along(dat[,1])-1)%/%2)+1) vec1-sapply(lst1,max) vec2-c(1,vec1[-length(vec1)]+1) res- lapply(seq_along(lst1),function(i) {x1-lst1[[i]]; x[seq(vec2[i],max(x1)),]}) res } fun1(XXX,electrode,6,electrode4) #Using previous dataset XXX, XXX1, XXX2 fun1(XXX,electrode,3,electrode1) fun1(XXX1,electrode,3,electrode1) fun1(XXX2,electrode,3,electrode1) A.K. - Original Message - From: arun smartpink...@yahoo.com To: dennis1...@gmx.net dennis1...@gmx.net Cc: R help r-help@r-project.org; Rui Barradas ruipbarra...@sapo.pt Sent: Friday, July 26, 2013 9:26 AM Subject: Re: [R] How to split two levels several times? Hi Dennis, I guess in this case, instead of Eletrode1 occuring 3 times, it is Electrode4 exists only 6 times. If that is the situation: just change: XXX: data rl-rle(as.character(XXX$electrode)) dat-do.call(rbind,lapply(seq_along(rl$lengths),function(i){x1-if(rl$values[i]==electrode4 (rl$lengths[i]%/%61)) rep(6,rl$lengths[i]%/%6) else rl$lengths[i];data.frame(Len=x1,Val=rl$values[i])})) lst1-split(cumsum(dat[,1]),((seq_along(dat[,1])-1)%/%2)+1) vec1-sapply(lst1,max) vec2-c(1,vec1[-length(vec1)]+1) res- lapply(seq_along(lst1),function(i) {x1-lst1[[i]]; XXX[seq(vec2[i],max(x1)),]}) res [[1]] electrode length 1 electrode1 206 2 electrode1 194 3 electrode1 182 4 electrode1 172 5 electrode1 169 6 electrode2 82 7 electrode2 78 8 electrode2 70 9 electrode2 58 [[2]] electrode length 10 electrode1 206 11 electrode1 194 12 electrode1 182 13 electrode1 172 14 electrode1 169 15 electrode3 260 16 electrode3 176 17 electrode3 137 [[3]] electrode length 18 electrode1 206 19 electrode1 194 20 electrode1 182 21 electrode1 172 22 electrode1 169 23 electrode4 86 24 electrode4 66 25 electrode4 64 26 electrode4 52 27 electrode4 27 28 electrode4 26 [[4]] electrode length 29 electrode2 82 30 electrode2 78 31 electrode2 70 32 electrode2 58 33 electrode1 206 34 electrode1 194 35 electrode1 182 36 electrode1 172 37 electrode1 169 [[5]] electrode length 38 electrode2 82 39 electrode2 78 40 electrode2 70 41 electrode2 58 42 electrode3 260 43 electrode3 176 44 electrode3 137 [[6]] electrode length 45 electrode2 82 46 electrode2 78 47 electrode2 70 48 electrode2 58 49 electrode4 86 50 electrode4 66 51 electrode4 64 52 electrode4 52 53 electrode4 27 54 electrode4 26 [[7]] electrode length 55 electrode3 260 56 electrode3 176 57 electrode3 137 58 electrode1 206 59 electrode1 194 60 electrode1 182 61 electrode1 172 62 electrode1 169 [[8]] electrode length 63 electrode3 260 64 electrode3 176 65 electrode3 137 66 electrode2 82 67 electrode2 78 68 electrode2 70 69 electrode2 58 [[9]] electrode length 70 electrode3 260 71 electrode3 176 72 electrode3 137 73 electrode4 86 74 electrode4 66 75 electrode4 64 76 electrode4 52 77 electrode4 27 78 electrode4 26 [[10]] electrode length 79 electrode4 86 80 electrode4 66 81 electrode4 64 82 electrode4 52 83 electrode4 27 84 electrode4 26 85 electrode1 206 86 electrode1 194 87 electrode1 182 88 electrode1 172 89 electrode1 169 [[11]] electrode length 90 electrode4 86 91 electrode4 66 92 electrode4 64 93 electrode4 52 94 electrode4 27 95 electrode4 26 96 electrode2 82 97 electrode2 78 98 electrode2 70 99 electrode2 58 [[12]] electrode length 100 electrode4 86 101 electrode4 66 102 electrode4 64 103 electrode4 52 104 electrode4 27 105 electrode4 26 106 electrode3 260 107 electrode3 176 108 electrode3 137 A.K. - Original Message - From: dennis1...@gmx.net dennis1...@gmx.net To: Rui Barradas ruipbarra...@sapo.pt; r-help@r-project.org Cc: Sent: Friday, July 26, 2013 6:07 AM Subject: Re: [R] How to split two levels several times? Hi Rui Arun, really thanks for investing so much time to deal with this problem! The code works now for this specific example. However it is not generally robust for slightly different situations. For instance it cannot correctly handle a slight variation of the table where I have again 4 types of electrodes of certain lengths. Electrode4 exists only 6 times. At the transition of the combinations 3-4 and 4-1 there are 12 times electrode4 which stick together in the output $`9`. This leads to wrong splittings thereafter. Sorry for asking such tricky questions. New table
Re: [R] help on carrying forward several vectors of rownames
Can you supply some code and data? At the moment the question in not clear, or not expressed in R terminology. Maybe I'm just not awake yet but I tried to insert these vectors as a list, e.g. row.names=c(1,4), from the excel file does not seem to make any sense at all probably because 'list' has a special meaning in R and I don't think that is what you mean. if it is, my appologies. Perhaps have a look at https://github.com/hadley/devtools/wiki/Reproducibility http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example John Kane Kingston ON Canada -Original Message- From: theobrom...@gmail.com Sent: Thu, 25 Jul 2013 22:00:32 -0700 To: r-help@r-project.org Subject: [R] help on carrying forward several vectors of rownames Dear reader, I'm trying to use multiple vectors as rownames for the read.table function. I tried to insert these vectors as a list, e.g. row.names=c(1,4), from the excel file. I tried other ways, as if the argument only took continuous tab separated columns, e.g. row.names=c(1:4). But, this also did not work. Is there a solution to this problem? Regards, -- Franklin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. FREE 3D EARTH SCREENSAVER - Watch the Earth right on your desktop! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Maintaining data order in factanal with missing data
Hi Petr, Thanks for the quick response. Unfortunately I cannot share the data I am working with, however please find attached a suitable R workspace with generated data. It has the appropriate variable names, only the data has been changed. The last function in the list (init.dfs()) I call to subset the overall data set into the three waves, then conduct the factor analysis on each (1 factor CFA); it's just in a function to ease re-typing in a new workspace. Thanks, Justin -Original Message- From: PIKAL Petr [mailto:petr.pi...@precheza.cz] Sent: Friday, 26 July 2013 7:35 PM To: Justin Delahunty; r-help@r-project.org Subject: RE: [R] Maintaining data order in factanal with missing data Hi You provided functions, so far so good. But without data it would be quite difficult to understand what the functions do and where could be the issue. I suspect combination of complete cases selection together with subset and factor behaviour. But I can be completely out of target too. Petr -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of s00123...@myacu.edu.au Sent: Friday, July 26, 2013 9:35 AM To: r-help@r-project.org Subject: [R] Maintaining data order in factanal with missing data Hi, I'm new to R, so sorry if this is a simple answer. I'm currently trying to collapse some ordinal variables into a composite; the program ideally should take a data frame as input, perform a factor analysis, compute factor scores, sds, etc., and return the rescaled scores and loadings. The difficulty I'm having is that my data set contains a number of NA, which I am excluding from the analysis using complete.cases(), and thus the incomplete cases are skipped. These functions are for a longitudinal data set with repeated waves of data, so the final rescaled scores from each wave need to be saved as variables grouped by a unique ID (DMID). The functions I'm trying to implement are as follows: weighted.sd-function(x,w){ sum.w-sum(w) sum.w2-sum(w^2) mean.w-sum(x*w)/sum(w) x.sd.w-sqrt((sum.w/(sum.w^2-sum.w2))*sum(w*(x-mean.w)^2)) return(x.sd.w) } re.scale-function(f.scores, raw.data, loadings){ fz.scores-(f.scores+mean(f.scores))/(sd(f.scores)) means-apply(raw.data,1,weighted.mean,w=loadings) sds-apply(raw.data,1,weighted.sd,w=loadings) grand.mean-mean(means) grand.sd-mean(sds) final.scores-((fz.scores*grand.sd)+grand.mean) return(final.scores) } get.scores-function(data){ fact- factanal(data[complete.cases(data),],factors=1,scores=regression) f.scores-fact$scores[,1] f.loads-fact$loadings[,1] rescaled.scores-re.scale(f.scores, data[complete.cases(data),], f.loads) output.list-list(rescaled.scores, f.loads) names(output.list)- c(rescaled.scores, factor.loadings) return(output.list) } init.dfs-function(){ ab.1.df-subset(ab.df,,select=c(dmid,g5oab2:g5ovb1)) ab.2.df-subset(ab.df,,select=c(dmid,w2oab3:w2ovb1)) ab.3.df-subset(ab.df,,select=c(dmid, w3oab3, w3oab4, w3oab7, w3oab8, w3ovb1)) ab.1.fa-get.scores(ab.1.df[-1]) ab.2.fa-get.scores(ab.2.df[-1]) ab.3.fa-get.scores(ab.3.df[-1]) } Thanks for your help, Justin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Maintaining data order in factanal with missing data
Hi Petr, So sorry, I accidentally attached the complete data set rather than the one with missing values. I've attached the correct file to this email. RE: init.dfs() being local, I hadn't even thought of that. I've been away from OOP for close to 15 years now, so it might be time to revise! The problem I have is that with missing values the list of factor scores returned (ab.w1.fa$factor.scores) does not map onto the originating data frame (ab.w1.df) as it no longer includes the cases which had missing values. So while the original data set for ab.w1.df contains 154 ordered cases, the factor analysis contains only 150. I am seeking a way to map the values derived from the factor analysis (ab.w1.fa$factor.scores) back to their original ordered position, so that these factor score variables may be merged back into the master data frame (ab.df). A unique ID for each case is available ($dmid) which I had thought to use when merging the new variables, however I don't know how to implement this. Thanks for your help, Justin -Original Message- From: PIKAL Petr [mailto:petr.pi...@precheza.cz] Sent: Friday, 26 July 2013 10:59 PM To: Justin Delahunty; Justin Delahunty; r-help@r-project.org Subject: RE: [R] Maintaining data order in factanal with missing data Hi Well, the function init.dfs does nothing as all data frames created inside it does not propagate to global environment and there is nothing what the function returns. Tha last line (when used outside a function) gives warnings but there is no sign of error. When head(ab.1.df) dmid g5oab2 g53 g54 g55 g5ovb1 11 1.418932 1.805227 2.791152 3.624116 3.425586 22 2.293907 1.187830 1.611237 1.748526 3.816533 33 2.836536 2.679523 1.279639 2.674986 2.452395 44 1.872259 3.278359 1.785872 2.458315 1.146480 55 1.467195 1.180747 3.564127 3.007682 2.109506 66 3.098512 3.151974 3.969379 3.750571 1.497358 head(ab.2.df) dmid w2oab3 w22 w23 w24 w2ovb1 11 4.831362 5.522764 7.809366 6.969172 7.398385 22 6.706346 4.101742 1.434697 5.266775 5.357641 33 3.653806 2.666885 1.209326 5.125556 4.963374 44 7.221255 7.649152 6.540398 6.648506 2.576081 55 1.848023 5.044314 2.761881 3.307220 1.454234 66 7.606429 4.911766 2.034813 2.638573 2.818834 head(ab.3.df) dmid w3oab3 w3oab4 w3oab7 w3oab8 w3ovb1 11 5.835609 6.108220 6.587721 2.451461 2.785467 22 4.973198 1.196815 6.388056 1.110877 4.226463 33 3.800367 6.697287 5.235345 6.666829 6.319073 44 1.093141 1.43 2.269252 3.194978 4.916342 55 1.975060 7.204516 4.825435 1.775874 3.484027 66 3.273361 2.243805 5.326547 5.720892 6.118723 str(ab.1.fa) List of 2 $ rescaled.scores: Named num [1:154] 3.43 3.83 2.43 1.1 2.08 ... ..- attr(*, names)= chr [1:154] 1 2 3 4 ... $ factor.loadings: Named num [1:5] -0.0106 -0.0227 -0.1093 -0.0912 0.9975 ..- attr(*, names)= chr [1:5] g5oab2 g53 g54 g55 ... str(ab.2.fa) List of 2 $ rescaled.scores: Named num [1:154] 6.34 5.24 5.3 1.91 2.16 ... ..- attr(*, names)= chr [1:154] 1 2 3 4 ... $ factor.loadings: Named num [1:5] -0.2042 0.0063 -0.2287 -0.0119 0.7138 ..- attr(*, names)= chr [1:5] w2oab3 w22 w23 w24 ... str(ab.3.fa) List of 2 $ rescaled.scores: Named num [1:154] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... ..- attr(*, names)= chr [1:154] 1 2 3 4 ... $ factor.loadings: Named num [1:5] -0.1172 0.0128 -0.0968 0.106 0.9975 ..- attr(*, names)= chr [1:5] w3oab3 w3oab4 w3oab7 w3oab8 ... Anyway I have no idea what you consider wrong? Regards Petr -Original Message- From: Justin Delahunty [mailto:a...@genius.net.au] Sent: Friday, July 26, 2013 2:22 PM To: PIKAL Petr; 'Justin Delahunty'; r-help@r-project.org Subject: RE: [R] Maintaining data order in factanal with missing data Hi Petr, Thanks for the quick response. Unfortunately I cannot share the data I am working with, however please find attached a suitable R workspace with generated data. It has the appropriate variable names, only the data has been changed. The last function in the list (init.dfs()) I call to subset the overall data set into the three waves, then conduct the factor analysis on each (1 factor CFA); it's just in a function to ease re-typing in a new workspace. Thanks, Justin -Original Message- From: PIKAL Petr [mailto:petr.pi...@precheza.cz] Sent: Friday, 26 July 2013 7:35 PM To: Justin Delahunty; r-help@r-project.org Subject: RE: [R] Maintaining data order in factanal with missing data Hi You provided functions, so far so good. But without data it would be quite difficult to understand what the functions do and where could be the issue. I suspect combination of complete cases selection together with subset and factor behaviour. But I can be completely out of target too. Petr -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On
Re: [R] Maintaining data order in factanal with missing data
Hi There are probably better options but merge(data.frame(x=1:154),data.frame(x=names(ab.1.fa[[1]]), y=ab.1.fa[[1]]), all.x=T) gives you data frame with NA when there was missing value in the first data.frame. You probably can automate the process a bit with nrow function. Regards Petr -Original Message- From: Justin Delahunty [mailto:a...@genius.net.au] Sent: Friday, July 26, 2013 3:34 PM To: PIKAL Petr; 'Justin Delahunty'; 'Justin Delahunty'; r-help@r- project.org Subject: RE: [R] Maintaining data order in factanal with missing data Hi Petr, So sorry, I accidentally attached the complete data set rather than the one with missing values. I've attached the correct file to this email. RE: init.dfs() being local, I hadn't even thought of that. I've been away from OOP for close to 15 years now, so it might be time to revise! The problem I have is that with missing values the list of factor scores returned (ab.w1.fa$factor.scores) does not map onto the originating data frame (ab.w1.df) as it no longer includes the cases which had missing values. So while the original data set for ab.w1.df contains 154 ordered cases, the factor analysis contains only 150. I am seeking a way to map the values derived from the factor analysis (ab.w1.fa$factor.scores) back to their original ordered position, so that these factor score variables may be merged back into the master data frame (ab.df). A unique ID for each case is available ($dmid) which I had thought to use when merging the new variables, however I don't know how to implement this. Thanks for your help, Justin -Original Message- From: PIKAL Petr [mailto:petr.pi...@precheza.cz] Sent: Friday, 26 July 2013 10:59 PM To: Justin Delahunty; Justin Delahunty; r-help@r-project.org Subject: RE: [R] Maintaining data order in factanal with missing data Hi Well, the function init.dfs does nothing as all data frames created inside it does not propagate to global environment and there is nothing what the function returns. Tha last line (when used outside a function) gives warnings but there is no sign of error. When head(ab.1.df) dmid g5oab2 g53 g54 g55 g5ovb1 11 1.418932 1.805227 2.791152 3.624116 3.425586 22 2.293907 1.187830 1.611237 1.748526 3.816533 33 2.836536 2.679523 1.279639 2.674986 2.452395 44 1.872259 3.278359 1.785872 2.458315 1.146480 55 1.467195 1.180747 3.564127 3.007682 2.109506 66 3.098512 3.151974 3.969379 3.750571 1.497358 head(ab.2.df) dmid w2oab3 w22 w23 w24 w2ovb1 11 4.831362 5.522764 7.809366 6.969172 7.398385 22 6.706346 4.101742 1.434697 5.266775 5.357641 33 3.653806 2.666885 1.209326 5.125556 4.963374 44 7.221255 7.649152 6.540398 6.648506 2.576081 55 1.848023 5.044314 2.761881 3.307220 1.454234 66 7.606429 4.911766 2.034813 2.638573 2.818834 head(ab.3.df) dmid w3oab3 w3oab4 w3oab7 w3oab8 w3ovb1 11 5.835609 6.108220 6.587721 2.451461 2.785467 22 4.973198 1.196815 6.388056 1.110877 4.226463 33 3.800367 6.697287 5.235345 6.666829 6.319073 44 1.093141 1.43 2.269252 3.194978 4.916342 55 1.975060 7.204516 4.825435 1.775874 3.484027 66 3.273361 2.243805 5.326547 5.720892 6.118723 str(ab.1.fa) List of 2 $ rescaled.scores: Named num [1:154] 3.43 3.83 2.43 1.1 2.08 ... ..- attr(*, names)= chr [1:154] 1 2 3 4 ... $ factor.loadings: Named num [1:5] -0.0106 -0.0227 -0.1093 -0.0912 0.9975 ..- attr(*, names)= chr [1:5] g5oab2 g53 g54 g55 ... str(ab.2.fa) List of 2 $ rescaled.scores: Named num [1:154] 6.34 5.24 5.3 1.91 2.16 ... ..- attr(*, names)= chr [1:154] 1 2 3 4 ... $ factor.loadings: Named num [1:5] -0.2042 0.0063 -0.2287 -0.0119 0.7138 ..- attr(*, names)= chr [1:5] w2oab3 w22 w23 w24 ... str(ab.3.fa) List of 2 $ rescaled.scores: Named num [1:154] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... ..- attr(*, names)= chr [1:154] 1 2 3 4 ... $ factor.loadings: Named num [1:5] -0.1172 0.0128 -0.0968 0.106 0.9975 ..- attr(*, names)= chr [1:5] w3oab3 w3oab4 w3oab7 w3oab8 ... Anyway I have no idea what you consider wrong? Regards Petr -Original Message- From: Justin Delahunty [mailto:a...@genius.net.au] Sent: Friday, July 26, 2013 2:22 PM To: PIKAL Petr; 'Justin Delahunty'; r-help@r-project.org Subject: RE: [R] Maintaining data order in factanal with missing data Hi Petr, Thanks for the quick response. Unfortunately I cannot share the data I am working with, however please find attached a suitable R workspace with generated data. It has the appropriate variable names, only the data has been changed. The last function in the list (init.dfs()) I call to subset the overall data set into the three waves, then conduct the factor analysis on each (1 factor CFA); it's just in a function to ease re-typing in a new workspace.
Re: [R] prediction survival curves for coxph-models; how to extract the right strata per individual
It would help me to give advice if I knew what you wanted to do with the new curves. Plot, print, extract? A more direct solution to your question will appear in the next release of the code, btw. Terry T. On 07/25/2013 05:00 AM, r-help-requ...@r-project.org wrote: My problem is: I have a coxph.model with several strata and other covariables. Now I want to fit the estimated survival-curves for new data, using survfit.coxph. But this returns a prediction for each stratum per individual. So if I have 15 new individuals and 10 strata, I have 150 fitted surivival curves (or if I don't use the subscripts I have 15 predictions with the curves for all strata pasted together) Is there any possibility to get only the survival curves for the stratum the new individual belongs to? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] X matrix deemed to be singular and cbind
Hi list, While the X matrix deemed to be singular question has been answered in the list for quite a few times, I have a twist to it. I am using the coxph model for survival analysis on a dataset containing over 160,000 instances and 46 independent variables and I have 2 scenarios: 1. If I use cbind on the 46 independent variables (many of which are categorical), coxph runs without any frills. The problem however is that it won't report which of the categorical variables (e.g. VERY HIGH, HIGH, NEUTRAL, LOW or VERY LOW) are actually meaningful/significant(e.g. XHIGH ***, XLOW ., etc). Is there any way to check this? 2. If I don't use cbind, assuming it'll give me the details I am looking for in the previous step, it throws me the X matrix deemed to be singular, more precisely: X matrix deemed to be singular; variable 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 Could anyone please elaborate on how to get around problem #1 or #2? Thanks! SD [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Saving multiple rda-files as one rda-file
On Jul 25, 2013, at 7:17 AM, Dark wrote: Hi, Yes maybe I should have been more clear on my problem. I want to append the different data-frames back into one variable ( rbind ) and save it as one R Data file. Indeed. That was the operation I had in mind when I made my suggestions. Perhaps you need to create a set of toy dataframes with similar structure and then the audience can propose solutions. That's the usual process around these parts. -- David. Regards Derk -- View this message in context: http://r.789695.n4.nabble.com/Saving-multiple-rda-files-as-one-rda-file-tp4672041p4672313.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Differential problem
Hello, I am trying to solve the following algebraic differential equation : dy1 = h - dy3 dy2 = g - dy4 y3 = K1*y1/(y3+y4) y4 = K2*y2/(y3+y4) I tried using the function daspk, but it gives me the following error : out - daspk(y = yini, dy = dyini, times = z, res = liquide, parms = 0) daspk-- warning.. At T(=R1) and stepsize H (=R2) the nonlinear solver failed to converge repeatedly of with abs (H) = HMIN g, 0 Warning messages: 1: In daspk(y = yini, dy = dyini, times = z, res = liquide, parms = 0) : repeated convergence test failures on a step - inaccurate Jacobian or preconditioner? 2: In daspk(y = yini, dy = dyini, times = z, res = liquide, parms = 0) : Returning early. Results are accurate, as far as they go head(out) time 12 3 4 5 6 7 8 9 [1,]0 0.9 1.33 0 0 10 20 1e-04 0.001 0.001 [2,]0 0.9 1.33 0 0 10 20 1e-04 0.001 0.001 I read the documentation but it only says that daspk can only solve DAE of index one maximum, which I do not understand how to determine. I was wondering if I had to use the function radau but I do not get how to do the M matrix? I did not managed to get it with the example. Could it be an error in my program? Here it is : liquide - function(z, C, dC, parameters) { with(as.list(c(C,dC,parameters)),{ Ct = C[1] + C[2] + C[3] + C[4] + C[5] + C[6] + C[7] + C[8] + C[9] + Ceau V - 1 K32 - 6.54*10^2 # m^3/mol K33 - 1.37*10^-2 # m^3/mol K34 - 0.330 # sans unité K35 - 5.81*10^1 # sans unité kf2 - 1.37 # m^3/mol kf3 - 1.37 # m^3/mol kf4 - 8.68*10^-1 # m^3 kf5 - 157.2*10^-6# m^3 K2 - 10^1.44*10^3# mol/m^3 K3 - 10^(-3.224)*10^3# mol/m^3 Ke - 10^-11 # mol/m^3 r1 - kf4*C[4]/V - kf4/K34*C[5]^2/(Ct*V) r2 - kf3*C[1]*C[2] - kf3/K33*C[4] r3 - kf2*C[1]^2 - kf2/K32*C[3] r4 - kf5*C[3]/V - kf5/K35*C[5]*C[6]*C[8]/(Ct^2*V) res1 - dC[1] + r2 + r3 # dNO2/dt res2 - dC[2] + r2 # dNO/dt res3 - dC[3] - r3 + r4 # dN2O4/dt res4 - dC[4] - r2 + r1 # dN2O3/dt res5 - dC[5] -2*r1 - r4 + dC[7]# dHNO2/dt res6 - dC[6] - r4 + dC[8] # dHNO3/dt res7 - C[7] - K3*C[5]/C[9] # dNO2-/dt res8 - C[8] - K2*C[6]/C[9] # dNO3-/dt res9 - dC[9] - dC[8] - dC[7] # dCH/dt list(c(res1, res2, res3, res4, res5, res6, res7, res8, res9)) }) } yini - c(0.9, 1.33, 0, 0, 10, 20, 0.001, 22.9, 23) dyini - c(1,1,1,1,1,1,1,1,1) Qm - 4 # kg/h Ceau - (Qm - yini[1]*0.046 - yini[2]*0.03 + yini[3]*0.092 + yini[4]*0.076 + yini[5]*0.062 + yini[6]*0.063)/0.018# mol/m^3 parameters - c(Qm = Qm, Ceau = Ceau) liquide(20,yini, dyini,parameters) z - seq(0, 120, by = 1) # en seconde library(deSolve) out - daspk(y = yini, dy = dyini, times = z, res = liquide, parms = 0) head(out) plot(out) Thanks Raphaëlle Carraud [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Holt-Winters problem
This post has NOT been accepted by the mailing list yet. This post was updated on Jul 26, 2013; 4:40pm. Hello, Two days ago I started my journey with R, and right now I'm struck on Holtwinters, and have no idea, what I'm doing wrong. So now, I'll show step by step what I've been doing. Actually it is a part of Introducy Time Series with R. 1) I download data from http://www.massey.ac.nz/~pscowper/ts/motororg.dat due to port blockage. Save it as motororg.txt 2) Inserting this code: Motor.dat - read.table(motororg.txt, header = T) attach(Motor.dat) Comp.hw1 - HoltWinters(complaints, beta = 0, gamma = 0) 3) Baang! Error here. Error in decompose(ts(x[1L:wind], start = start(x), frequency = f), seasonal) : time series has no or less than 2 periods 4) Actually it has, and I turned it on with attach command? Motor.dat complaints 1 27 2 34 3 31 4 24 5 18 6 19 7 17 8 12 9 26 10 14 11 18 12 33 13 31 14 31 15 19 16 17 17 10 18 25 19 26 20 18 21 18 22 10 23 4 24 20 25 28 26 21 27 18 28 23 29 19 30 16 31 10 32 24 33 11 34 11 35 13 36 27 37 18 38 20 39 21 40 6 41 9 42 29 43 12 44 19 45 14 46 19 47 16 48 23 Can you tell me what I am doing wrong? PS I tried to make complaints.ts as well but I didn't work either. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] cannot install XML package - getting cannot open URL 'http://cran.cnr.Berkeley.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip' error
Hello All, I am not able to install/update XML package, getting the following: install.packages(XML, lib = .libPaths()[3]) trying URL 'http://cran.cnr.Berkeley.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip' Error in download.file(url, destfile, method, mode = wb, ...) : cannot open URL 'http://cran.cnr.Berkeley.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip' In addition: Warning message: In download.file(url, destfile, method, mode = wb, ...) : cannot open: HTTP status was '503 Service Unavailable' Warning in download.packages(pkgs, destdir = tmpd, available = available, : download of package 'XML' failed Also have tried to download the XML_3.98-1.1.zip from http://cran.r-project.org/web/packages/XML/index.html - getting an error: Unable to download XML_3.98-1.1.zip from cran.r-project.org Unable to open this internet site. The requested site is either unavailable or cannot be found. Please try again later. Using R 3.0.1 on windows and able to install/update other packages fine. Please help, Many thanks, Alex = This material has been prepared by individual sales and/or trading personnel and does not constitute investment research. Please follow the attached hyperlink to an important disclaimer: http://www.credit-suisse.com/americas/legal/salestrading = === Please access the attached hyperlink for an important el...{{dropped:8}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Repeated measures Cox regression ??coxph??
Thanks Terry. Good points. I recalled last night some exchanges on r-sig-mixed-models regarding a reasonable number of 'replications' for the estimation of random effects and it occurred to me that with this study, you will have 0, 1 or 2 events per subject, depending upon the subject risk profiles for hip replacement and length of follow up. It was not clear to me if John's cohort study is retrospective or prospective. If the former, then he will have some insights into the event distribution. If the latter and he needs to pre-specify the analytic method, a GEE style approach using coxph() may make more sense here given the unknowns. Regards, Marc On Jul 26, 2013, at 7:02 AM, Terry Therneau thern...@mayo.edu wrote: Two choices. If this were a linear model, do you like the GEE approach or a mixed effects approach? Assume that subject is a variable containing a per-subject identifier. GEE approach: add + cluster(subject) to the model statement in coxph Mixed models approach: Add + (1|subject) to the model statment in coxme. When only a very few subjects have multiple events, the mixed model (random effect) approach may not be reliable, however. Multiple events per group are the fuel for estimation of the variance of the random effect, and with few of these the profile likelihood of the random effect will be very flat. You can get esssentially a random estimate of the variance of the subject effect. I'm still getting my arms around this issue, and it has taken me a long time. Frailty is an alternate label for random effects when all we have is a random intercept. Multiple labels for the same idea adds confusion, but nothing else. Terry Therneau On 07/25/2013 08:14 PM, Marc Schwartz wrote: On Jul 25, 2013, at 4:45 PM, David Winsemiusdwinsem...@comcast.net wrote: On Jul 25, 2013, at 12:27 PM, Marc Schwartz wrote: On Jul 25, 2013, at 2:11 PM, John Sorkinjsor...@grecc.umaryland.edu wrote: Colleagues, Is there any R package that will allow one to perform a repeated measures Cox Proportional Hazards regression? I don't think coxph is set up to handle this type of problem, but I would be happy to know that I am not correct. I am doing a study of time to hip joint replacement. As each person has two hips, a given person can appear in the dataset twice, once for the left hip and once for the right hip, and I need to account for the correlation of data from a single individual. Thank you, John John, See Terry's 'coxme' package: http://cran.r-project.org/web/packages/coxme/index.html When I looked over the description of coxme, I was concerned it was not really designed with this in mind. Looking at Therneau and Grambsch, I thought section 8.4.2 in the 'Multiple Events per Subject' Chapter fit the analysis question well. There they compared the use of coxph( ...+cluster(ID),,...) withcoxph( ...+strata(ID),,...). Unfortunately I could not tell for sure which one was being described as superio but I think it was the cluster() alternative. I seem to remember there are discussions in the archives. David, I think that you raise a good point. The example in the book (I had to wait to get home to read it) is potentially different however, in that the subject's eye's were randomized to treatment or control, which would seem to suggest comparable baseline characteristics for each pair of eyes, as well as an active intervention on one side where a difference in treatment effect between each eye is being analyzed. It is not clear from John's description above if there is one hip that will be treated versus one as a control and whether the extent of disease at baseline is similar in each pair of hips. Presumably the timing of hip replacements will be staggered at some level, even if there is comparable disease, simply due to post-op recovery time and surgical risk. In cases where the disease between each hip is materially different, that would be another factor to consider, however I would defer to orthopaedic physicians/surgeons from a subject matter expertise consideration. It is possible that the bilateral hip replacement data might be more of a parallel to bilateral breast cancer data, if each breast were to be tracked separately. I have cc'd Terry here, hoping that he might jump in and offer some insights into the pros/cons of using coxme versus coxph with either a cluster or strata based approach, or perhaps even a frailty based approach as in 9.4.1 in the book. Regards, Marc -- David. You also might find the following of interest: http://bjo.bmj.com/content/71/9/645.full.pdf http://www.ncbi.nlm.nih.gov/pubmed/6885 http://www.ncbi.nlm.nih.gov/pubmed/22078901 Regards, Marc Schwartz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read
Re: [R] Pairwise comparison between columns, logic
Hi Manisha, I didn't run your dataset as I am on the way to college. But, from the error reported, I think it will be due to some missing combinations in one of the dataset. For ex. if you run my previous code without removing CEBPA: ie. mat1- combn(gset[,1],2) lst2-lapply(split(mat1,col(mat1)),function(x) {x1-join_all(lst1[x],by=patient_id,type=inner);x1[patient_id] } ) #Error: All inputs to rbind.fill must be data.frames So, check whether all the combinations are available in the `lst1`. 2. I will get back to you once I run it. A.K. From: Manisha manishab...@gmail.com To: arun smartpink...@yahoo.com Sent: Friday, July 26, 2013 11:09 AM Subject: Re: Pairwise comparison between columns, logic Hi Arun, I ran the script on a larger dataset and I seem to be running into this following error: Error: All inputs to rbind.fill must be data.frames after the step; lst2-lapply(split(mat1,col(mat1)),function(x) {x1-join_all(lst1[x],by=firehose_patient_id,type=inner);x1[firehose_patient_id]}) I tried a few things to solve the issue but I am not able to. The format of input files and data are same as in the code you posted. Could you suggest me something? I have attached my input files on which I am trying to run the code. See attached code as well. Minor changes have been made by me. 2. I have another question. From your code how do also capture those pairs of names that donot have any common patient id? Thanks again, -M On Fri, Jul 26, 2013 at 9:29 AM, arun smartpink...@yahoo.com wrote: Hi M, No problem. Regards, Arun - Original Message - From: manishab...@gmail.com manishab...@gmail.com To: smartpink...@yahoo.com Cc: Sent: Friday, July 26, 2013 9:27 AM Subject: Re: Pairwise comparison between columns, logic Thanks for the code. It is elegant and does what I need. Learnt some new things. -M _ Sent from http://r.789695.n4.nabble.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] X matrix deemed to be singular and cbind
Soumitro: Have you read An Introduction to R. If not, do so, as some of your confusion appears related to basic concepts (e.g. of factors) explained there. 1. Presumably your categorical variables are factors, not character. If so, when you cbind() them, you cbind their integer codes, yielding numerical variables. This produces an in incorrect design matrix in fitting -- 1 df per categorical variable instead of 1 less than the number of levels. Also see ?cbind. 2. Produces the correct design matrix, but you are overfitting, presumably because of many different levels for your categorical variables. I suggest you consult with a local statistician to decide how best to handle this, as you seem to be out of your depth with regard to model fitting. ... unless I have misunderstood, of course. Cheers, Bert On Fri, Jul 26, 2013 at 7:55 AM, Soumitro Dey soumitrod...@gmail.com wrote: Hi list, While the X matrix deemed to be singular question has been answered in the list for quite a few times, I have a twist to it. I am using the coxph model for survival analysis on a dataset containing over 160,000 instances and 46 independent variables and I have 2 scenarios: 1. If I use cbind on the 46 independent variables (many of which are categorical), coxph runs without any frills. The problem however is that it won't report which of the categorical variables (e.g. VERY HIGH, HIGH, NEUTRAL, LOW or VERY LOW) are actually meaningful/significant(e.g. XHIGH ***, XLOW ., etc). Is there any way to check this? 2. If I don't use cbind, assuming it'll give me the details I am looking for in the previous step, it throws me the X matrix deemed to be singular, more precisely: X matrix deemed to be singular; variable 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 Could anyone please elaborate on how to get around problem #1 or #2? Thanks! SD [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Repeated measures Cox regression ??coxph??
Marc, Thank you for your comments. The data has been previously collected, so the study is a non-concurrent prospective analysis, i.e. retrospective analysis. John John David Sorkin M.D., Ph.D. Professor of Medicine Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Marc Schwartz marc_schwa...@me.com 07/26/13 11:13 AM Thanks Terry. Good points. I recalled last night some exchanges on r-sig-mixed-models regarding a reasonable number of 'replications' for the estimation of random effects and it occurred to me that with this study, you will have 0, 1 or 2 events per subject, depending upon the subject risk profiles for hip replacement and length of follow up. It was not clear to me if John's cohort study is retrospective or prospective. If the former, then he will have some insights into the event distribution. If the latter and he needs to pre-specify the analytic method, a GEE style approach using coxph() may make more sense here given the unknowns. Regards, Marc On Jul 26, 2013, at 7:02 AM, Terry Therneau thern...@mayo.edu wrote: Two choices. If this were a linear model, do you like the GEE approach or a mixed effects approach? Assume that subject is a variable containing a per-subject identifier. GEE approach: add + cluster(subject) to the model statement in coxph Mixed models approach: Add + (1|subject) to the model statment in coxme. When only a very few subjects have multiple events, the mixed model (random effect) approach may not be reliable, however. Multiple events per group are the fuel for estimation of the variance of the random effect, and with few of these the profile likelihood of the random effect will be very flat. You can get esssentially a random estimate of the variance of the subject effect. I'm still getting my arms around this issue, and it has taken me a long time. Frailty is an alternate label for random effects when all we have is a random intercept. Multiple labels for the same idea adds confusion, but nothing else. Terry Therneau On 07/25/2013 08:14 PM, Marc Schwartz wrote: On Jul 25, 2013, at 4:45 PM, David Winsemiusdwinsem...@comcast.net wrote: On Jul 25, 2013, at 12:27 PM, Marc Schwartz wrote: On Jul 25, 2013, at 2:11 PM, John Sorkinjsor...@grecc.umaryland.edu wrote: Colleagues, Is there any R package that will allow one to perform a repeated measures Cox Proportional Hazards regression? I don't think coxph is set up to handle this type of problem, but I would be happy to know that I am not correct. I am doing a study of time to hip joint replacement. As each person has two hips, a given person can appear in the dataset twice, once for the left hip and once for the right hip, and I need to account for the correlation of data from a single individual. Thank you, John John, See Terry's 'coxme' package: http://cran.r-project.org/web/packages/coxme/index.html When I looked over the description of coxme, I was concerned it was not really designed with this in mind. Looking at Therneau and Grambsch, I thought section 8.4.2 in the 'Multiple Events per Subject' Chapter fit the analysis question well. There they compared the use of coxph( ...+cluster(ID),,...) withcoxph( ...+strata(ID),,...). Unfortunately I could not tell for sure which one was being described as superio but I think it was the cluster() alternative. I seem to remember there are discussions in the archives. David, I think that you raise a good point. The example in the book (I had to wait to get home to read it) is potentially different however, in that the subject's eye's were randomized to treatment or control, which would seem to suggest comparable baseline characteristics for each pair of eyes, as well as an active intervention on one side where a difference in treatment effect between each eye is being analyzed. It is not clear from John's description above if there is one hip that will be treated versus one as a control and whether the extent of disease at baseline is similar in each pair of hips. Presumably the timing of hip replacements will be staggered at some level, even if there is comparable disease, simply due to post-op recovery time and surgical risk. In cases where the disease between each hip is materially different, that would be another factor to consider, however I would defer to orthopaedic physicians/surgeons from a subject matter expertise consideration. It is possible that the bilateral hip replacement data might be more of a parallel to bilateral breast cancer data, if each breast were to be tracked separately. I have cc'd Terry here, hoping that he might jump in and offer some insights into the
[R] Hmisc ctable rotate option obsolete?
Dear R-Users and R-Devels, I may have found a deprecated option for the 'latex' function in the Hmisc package. I am working with Hmisc and knitr and tried the following code: \documentclass{article} \usepackage[utf8]{inputenc} \usepackage{amsmath, amssymb} \usepackage{ctable} %\usepackage{booktabs} \begin{document} results = 'asis'= library(Hmisc) iris.t - head(iris) iris.t[seq(2, NROW(iris.t), by = 2),] - format(iris.t[seq(2, NROW(iris.t), by = 2),], scientific = TRUE) texMat - matrix(, ncol = 5, nrow = 6) texMat[seq(2,nrow(texMat), by = 2), ] - scriptsize latex(iris.t, file = '', landscape = TRUE, dcolumn = TRUE, col.just = c('r','c', 'r','c', 'l'), cdec = c(0, 0, 1, 1, 0), na.blank = TRUE, rowname = '', rowlabel = '', cellTexCmd = texMat, ctable = TRUE, cgroup = c('Observations', ''), n.cgroup = c(4, 1), rgroup = c('',''), n.rgroup = c(3, 3), caption = 'iris' ) @ \end{document} Everything runs fine but the 'landscape' option. It says in the help for 'latex' that if option 'ctable' is set to TRUE the 'rotate' option for ctable is used if 'nadscape' is set TRUE. Looking at the ctable documentary (http://texdoc.net/texmf-dist/doc/latex/ctable/ctable.pdf) in section Change History, I get for version v1.07: General: Added option sideways, option rotate now obsolete. Hasn't this been updated in the Hmisc package? Best Simon __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Holt-Winters problem
On 26-07-2013, at 16:18, Przemek Gawin arl...@gmail.com wrote: This post has NOT been accepted by the mailing list yet. This post was updated on Jul 26, 2013; 4:40pm. Hello, Two days ago I started my journey with R, and right now I'm struck on Holtwinters, and have no idea, what I'm doing wrong. So now, I'll show step by step what I've been doing. Actually it is a part of Introducy Time Series with R. 1) I download data from http://www.massey.ac.nz/~pscowper/ts/motororg.dat due to port blockage. Save it as motororg.txt 2) Inserting this code: Motor.dat - read.table(motororg.txt, header = T) attach(Motor.dat) Comp.hw1 - HoltWinters(complaints, beta = 0, gamma = 0) 3) Baang! Error here. Error in decompose(ts(x[1L:wind], start = start(x), frequency = f), seasonal) : time series has no or less than 2 periods Your complaints vector is not a timeseries. From the section Arguments of HoltWinters: gamma parameter used for the seasonal component. If set to FALSE, an non-seasonal model is fitted. So use gamma=FALSE. Don't use attach(). Use Motor.dat$complaints in the HoltWinter call or Comp.hw1 - with(Motor.dat, HoltWinters(complaints, beta = 0, gamma = FALSE)) Berend 4) Actually it has, and I turned it on with attach command? Motor.dat complaints 1 27 2 34 3 31 4 24 5 18 6 19 7 17 8 12 9 26 10 14 11 18 12 33 13 31 14 31 15 19 16 17 17 10 18 25 19 26 20 18 21 18 22 10 23 4 24 20 25 28 26 21 27 18 28 23 29 19 30 16 31 10 32 24 33 11 34 11 35 13 36 27 37 18 38 20 39 21 40 6 41 9 42 29 43 12 44 19 45 14 46 19 47 16 48 23 Can you tell me what I am doing wrong? PS I tried to make complaints.ts as well but I didn't work either. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to double integrate a function in R
Hello, R users! I am trying to double integrate the following expression: # expression (1/(2*pi))*exp(-y2/2)*sqrt((y1/(y2-y1))) for y2y10. I am trying the following approach # first attempt library(cubature) fun - function(x) { (1/(2*pi))*exp(-x[2]/2)*sqrt((x[1]/(x[2]-x[1])))} adaptIntegrate(fun, lower = c(0,0), upper =c(5, 6), tol=1e-8) However, I don't know how to constrain the integration so that y2y10. Any ideas? Tiago __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] transform dataframe with look-up table
Hello First thank you very much to Jean, Bill, Brian and David for the answers and code. I very extremely grateful. I am eventually going to adapt Brian's code with a very minor alteration. If one follows the original syntax End - merge(merge(Start, transformer, by.x=Left, by.y=input, all.x=TRUE), transformer, by.x=Right, by.y=input, all.x=TRUE) the Left variables are listed right and vice versa, which seems odd. Just swapping Left and Right will put it right (pun intentional): End - merge(merge(Start, transformer, by.x=Right, by.y=input, all.x=TRUE), transformer, by.x=Left, by.y=input, all.x=TRUE) Best wishes Juan Antonio El 25/07/2013 18:02, Brian Diggs escribió: On 7/25/2013 8:13 AM, Juan Antonio Balbuena wrote: Hello I hope that there is a simple solution to this apparently complex problem. Any help will be much appreciated: I have a dataframe with Left and Right readings (that is, elements in each row are paired). For instance, Left Right [1] 98 [2] 43 [3] 21 [4] 65 [5] 31 [6] 41 [7] 32 [8] 42 [9] 10 8 [10] 9 10 I need to produce a new data frame where the values are transformed according to a look-up table such as inputoutput [1] 5 1 [2]10 1 [3] 4 2 [4] 8 3 [5] 6 5 [6] 5 6 [7] 7 6 [8] 2 7 [9] 9 7 [10]107 [11] 2 8 So [1, ] in the new dataframe would be 7 3. Quite simple so far, but what makes things complicated is the multiple outputs for a single input. In thi s example, 10 corresponds to 1 and 7 so [9, ] in the input dataframe must yield two rows in its output counterpart: 1 3 and 7 3. Likewise the output for [10, ] should be 7 1 and 7 7. In addition, given that 3 and 1 are missing as inputs the output for [5, ] should be NA NA. Thank you very much for your time. Juan Antonio Balbuena merge can handle both of these requirements. First, making the two datasets reproducible: Start - data.frame(Left=c(9,4,2,6,3,4,3,4,10,9), Right=c(8,3,1,5,1,1,2,2,8,10)) transformer - data.frame(input=c(5,10,4,8,6,5,7,2,9,10,2), output=c(1,1,2,3,5,6,6,7,7,7,8)) Then add a marker of the original row numbers so that the work can be checked more easily later (not really needed for the calculations): Start$rownum - seq_len(nrow(Start)) Two merge statements with the columns specified and all.x set to TRUE (to keep cases even without a match): End - merge(merge(Start, transformer, by.x=Left, by.y=input, all.x=TRUE), transformer, by.x=Right, by.y=input, all.x=TRUE) Then we can look at the output, resorted by the original row numbers: End[order(End$rownum),] which gives Right Left rownum output.x output.y 12 89 173 9 34 22 NA 1 12 37 NA 2 12 38 NA 10 56 456 11 56 451 3 13 5 NA NA 4 14 62 NA 5 23 7 NA7 6 23 7 NA8 7 24 827 8 24 828 13 8 10 913 14 8 10 973 15109 1071 16109 1077 -- Dr. Juan A. Balbuena Marine Zoology Unit Cavanilles Institute of Biodiversity and Evolutionary Biology University of Valencia [1][1]http://www.uv.es/~balbuena P.O. Box 22085 [2][2]http://www.uv.es/cavanilles/zoomarin/index.htm 46071 Valencia, Spain [3][3]http://cetus.uv.es/mullpardb/index.html e-mail: [[4]4]j.a.balbu...@uv.estel. +34 963 543 658fax +34 963 543 733 NOTE! For shipments by EXPRESS COURIER use the following street address: C/ Catedrático José Beltrán 2, 46980 Paterna (Valencia), Spain. References 1. [5]http://www.uv.es/%7Ebalbuena 2. [6]http://www.uv.es/cavanilles/zoomarin/index.htm 3. [7]http://cetus.uv.es/mullpardb/index.html 4. [8]mailto:j.a.balbu...@uv.es -- Dr. Juan A. Balbuena Marine Zoology Unit Cavanilles Institute of Biodiversity and Evolutionary Biology University of Valencia [9]http://www.uv.es/~balbuena P.O. Box 22085 [10]http://www.uv.es/cavanilles/zoomarin/index.htm 46071 Valencia, Spain [11]http://cetus.uv.es/mullpardb/index.html e-mail: [12]j.a.balbu...@uv.estel. +34 963 543 658fax +34 963 543 733
Re: [R] cannot install XML package - getting cannot open URL 'http://cran.cnr.Berkeley.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip' error
I have no trouble accessing either of those files. Keep in mind that servers do not have 100% uptime (and the people responsible for uptime do not monitor this forum), and any local network problems you may be having cannot be addressed here. So, in general you need to try more servers, confirm the file you want does exist, verify that your local network is functioning, and practice a little patience. --- Jeff NewmillerThe . . Go Live... DCN:jdnew...@dcn.davis.ca.usBasics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/BatteriesO.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. Grach, Alexander alexander.gr...@credit-suisse.com wrote: Hello All, I am not able to install/update XML package, getting the following: install.packages(XML, lib = .libPaths()[3]) trying URL 'http://cran.cnr.Berkeley.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip' Error in download.file(url, destfile, method, mode = wb, ...) : cannot open URL 'http://cran.cnr.Berkeley.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip' In addition: Warning message: In download.file(url, destfile, method, mode = wb, ...) : cannot open: HTTP status was '503 Service Unavailable' Warning in download.packages(pkgs, destdir = tmpd, available = available, : download of package 'XML' failed Also have tried to download the XML_3.98-1.1.zip from http://cran.r-project.org/web/packages/XML/index.html - getting an error: Unable to download XML_3.98-1.1.zip from cran.r-project.org Unable to open this internet site. The requested site is either unavailable or cannot be found. Please try again later. Using R 3.0.1 on windows and able to install/update other packages fine. Please help, Many thanks, Alex = This material has been prepared by individual sales and/or trading personnel and does not constitute investment research. Please follow the attached hyperlink to an important disclaimer: http://www.credit-suisse.com/americas/legal/salestrading = === Please access the attached hyperlink for an important el...{{dropped:8}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] X matrix deemed to be singular and cbind
Thanks for your response. You are right - I am new to R and it's terminologies. I will follow up on your suggestions. On Fri, Jul 26, 2013 at 11:22 AM, Bert Gunter gunter.ber...@gene.comwrote: Soumitro: Have you read An Introduction to R. If not, do so, as some of your confusion appears related to basic concepts (e.g. of factors) explained there. 1. Presumably your categorical variables are factors, not character. If so, when you cbind() them, you cbind their integer codes, yielding numerical variables. This produces an in incorrect design matrix in fitting -- 1 df per categorical variable instead of 1 less than the number of levels. Also see ?cbind. 2. Produces the correct design matrix, but you are overfitting, presumably because of many different levels for your categorical variables. I suggest you consult with a local statistician to decide how best to handle this, as you seem to be out of your depth with regard to model fitting. ... unless I have misunderstood, of course. Cheers, Bert On Fri, Jul 26, 2013 at 7:55 AM, Soumitro Dey soumitrod...@gmail.com wrote: Hi list, While the X matrix deemed to be singular question has been answered in the list for quite a few times, I have a twist to it. I am using the coxph model for survival analysis on a dataset containing over 160,000 instances and 46 independent variables and I have 2 scenarios: 1. If I use cbind on the 46 independent variables (many of which are categorical), coxph runs without any frills. The problem however is that it won't report which of the categorical variables (e.g. VERY HIGH, HIGH, NEUTRAL, LOW or VERY LOW) are actually meaningful/significant(e.g. XHIGH ***, XLOW ., etc). Is there any way to check this? 2. If I don't use cbind, assuming it'll give me the details I am looking for in the previous step, it throws me the X matrix deemed to be singular, more precisely: X matrix deemed to be singular; variable 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 Could anyone please elaborate on how to get around problem #1 or #2? Thanks! SD [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Maintaining data order in factanal with missing data
When you use complete.cases(), it creates a logical vector that selects cases with no missing values, but it does not change the rownames in the data.frame and those are carried through to the factor scores so you could link them up that way. Alternatively, you could use na.exclude as the na.action in the call to factanal() instead of complete.cases()? That should pad the output with NAs for the cases that have missing data. You have to use the formula version of factanal(): set.seed(42) example - data.frame(x=rnorm(15), y=rnorm(15), z=rnorm(15)) to.na - cbind(sample.int(15, 3), sample.int(3, 3)) example[to.na] - NA out - factanal(~x+y+z, 1, data=example, na.action=na.omit, scores=regression) out$scores # With na.omit the cases with missing values are gone as indicated Factor1 # by the missing row numbers 2 -0.92604879 4 0.10731539 5 -0.24370504 6 0.07357697 7 0.69905895 8 -0.17646575 9 1.58430095 10 -0.35934769 12 1.07671299 13 -1.47487960 14 -0.30235156 15 -0.05816682 out - factanal(~x+y+z, 1, data=example, na.action=na.exclude, scores=regression) out$scores # With na.exclude, the cases are kept out of the analysis but the rows are Factor1 # preserved in the factor scores output 1 NA 2 -0.92604879 3 NA 4 0.10731539 5 -0.24370504 6 0.07357697 7 0.69905895 8 -0.17646575 9 1.58430095 10 -0.35934769 11 NA 12 1.07671299 13 -1.47487960 14 -0.30235156 15 -0.05816682 - David L Carlson Associate Professor of Anthropology Texas AM University College Station, TX 77840-4352 Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of PIKAL Petr Sent: Friday, July 26, 2013 9:06 AM To: s00123...@myacu.edu.au; 'Justin Delahunty'; r-help@r-project.org Subject: Re: [R] Maintaining data order in factanal with missing data Hi There are probably better options but merge(data.frame(x=1:154),data.frame(x=names(ab.1.fa[[1]]), y=ab.1.fa[[1]]), all.x=T) gives you data frame with NA when there was missing value in the first data.frame. You probably can automate the process a bit with nrow function. Regards Petr -Original Message- From: Justin Delahunty [mailto:a...@genius.net.au] Sent: Friday, July 26, 2013 3:34 PM To: PIKAL Petr; 'Justin Delahunty'; 'Justin Delahunty'; r-help@r- project.org Subject: RE: [R] Maintaining data order in factanal with missing data Hi Petr, So sorry, I accidentally attached the complete data set rather than the one with missing values. I've attached the correct file to this email. RE: init.dfs() being local, I hadn't even thought of that. I've been away from OOP for close to 15 years now, so it might be time to revise! The problem I have is that with missing values the list of factor scores returned (ab.w1.fa$factor.scores) does not map onto the originating data frame (ab.w1.df) as it no longer includes the cases which had missing values. So while the original data set for ab.w1.df contains 154 ordered cases, the factor analysis contains only 150. I am seeking a way to map the values derived from the factor analysis (ab.w1.fa$factor.scores) back to their original ordered position, so that these factor score variables may be merged back into the master data frame (ab.df). A unique ID for each case is available ($dmid) which I had thought to use when merging the new variables, however I don't know how to implement this. Thanks for your help, Justin -Original Message- From: PIKAL Petr [mailto:petr.pi...@precheza.cz] Sent: Friday, 26 July 2013 10:59 PM To: Justin Delahunty; Justin Delahunty; r-help@r-project.org Subject: RE: [R] Maintaining data order in factanal with missing data Hi Well, the function init.dfs does nothing as all data frames created inside it does not propagate to global environment and there is nothing what the function returns. Tha last line (when used outside a function) gives warnings but there is no sign of error. When head(ab.1.df) dmid g5oab2 g53 g54 g55 g5ovb1 11 1.418932 1.805227 2.791152 3.624116 3.425586 22 2.293907 1.187830 1.611237 1.748526 3.816533 33 2.836536 2.679523 1.279639 2.674986 2.452395 44 1.872259 3.278359 1.785872 2.458315 1.146480 55 1.467195 1.180747 3.564127 3.007682 2.109506 66 3.098512 3.151974 3.969379 3.750571 1.497358 head(ab.2.df) dmid w2oab3 w22 w23 w24 w2ovb1 11 4.831362 5.522764 7.809366 6.969172 7.398385 22 6.706346 4.101742 1.434697 5.266775 5.357641 33 3.653806 2.666885 1.209326 5.125556 4.963374 44 7.221255 7.649152 6.540398 6.648506 2.576081 55 1.848023 5.044314 2.761881 3.307220 1.454234 66 7.606429 4.911766 2.034813 2.638573 2.818834 head(ab.3.df) dmid w3oab3 w3oab4 w3oab7 w3oab8 w3ovb1 11 5.835609 6.108220 6.587721 2.451461 2.785467 22 4.973198 1.196815
Re: [R] Saving multiple rda-files as one rda-file
What you need to ensure is that you have sufficient physical memory for the operations that you want to do. I would suggest at least 3X the size of the object you want to create. If you have RData files that will result (after the rbind) into a 40GB object, then you will need over 100GB of physical memory since the rbind operation will be creating a new object of 40GB from the separate files that total 40GB, so that is 80GB right there. If you then want to do some operations on an object that large, you might be making copies, so you need the memory. Maybe you should consider keeping the data in a relational database and then use the SELECT operator to get just the data you need. Also the aggregation operators might be useful to reduce the size of physical memory that you need. On Fri, Jul 26, 2013 at 11:04 AM, David Winsemius dwinsem...@comcast.netwrote: On Jul 25, 2013, at 7:17 AM, Dark wrote: Hi, Yes maybe I should have been more clear on my problem. I want to append the different data-frames back into one variable ( rbind ) and save it as one R Data file. Indeed. That was the operation I had in mind when I made my suggestions. Perhaps you need to create a set of toy dataframes with similar structure and then the audience can propose solutions. That's the usual process around these parts. -- David. Regards Derk -- View this message in context: http://r.789695.n4.nabble.com/Saving-multiple-rda-files-as-one-rda-file-tp4672041p4672313.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to double integrate a function in R
On Jul 26, 2013, at 8:44 AM, Tiago V. Pereira wrote: I am trying to double integrate the following expression: # expression (1/(2*pi))*exp(-y2/2)*sqrt((y1/(y2-y1))) for y2y10. I am trying the following approach # first attempt library(cubature) fun - function(x) { (1/(2*pi))*exp(-x[2]/2)*sqrt((x[1]/(x[2]-x[1])))} adaptIntegrate(fun, lower = c(0,0), upper =c(5, 6), tol=1e-8) However, I don't know how to constrain the integration so that y2y10. Generally incorporating boundaries is accomplished by multiplying the integrand with logical vectors that encapsulate what are effectively two conditions: Perhaps: fun - function(x) { (x[1]x[2])*(x[1]0)* (1/(2*pi))*exp(-x[2]/2)* sqrt((x[1]/(x[2]-x[1])))} That was taking quite a long time and I interrupted it. There were quite a few warnings of the sort 1: In sqrt((x[1]/(x[2] - x[1]))) : NaNs produced 2: In sqrt((x[1]/(x[2] - x[1]))) : NaNs produced Thinking the NaNs might sabotage the integration process, I added a conditional to the section of that expression that was generating the NaNs. I don't really know whether NaN's are excluded from the summation process in adaptIntegrate: fun - function(x) { (x[1]x[2])*(x[1]0)* (1/(2*pi))*exp(-x[2]/2)* if(x[1]x[2]){ 0 }else{ sqrt((x[1]/(x[2]-x[1])) )} } adaptIntegrate(fun, lower = c(0,0), upper =c(5, 6) ) I still didn't have the patience to wait for an answer, but I did plot the function: fun2 - function(x,y) { (xy)*(x0)* (1/(2*pi))*exp(-y/2)* sqrt((x/(y-x)))} persp(outer(0:5, 0:6, fun2) ) So at least the function is finite over most of its domain. -- David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Externalptr class to character class from (web) scrape
I'm hitting a wall. When I use the 'scrape' function from the package 'scrapeR' to get the pagesource from a web page, I do the following: (as an example) website.doc = parse(http://www.google.com;) When I look at it, it seems fine: website.doc[[1]] This seems to have the information I need. Then when I try to get it into a character vector, character.website = as.character(website.doc[[1]]) I get the error: Error in as.vector(x, character) : cannot coerce type 'externalptr' to vector of type 'character' I'm trying very very hard to wrap my head around how to get this external pointer to a character, but after reading many help files, I cannot understand how to do this. Any ideas? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Can't figure out why short figure won't work
[Sorry I can't quote past messages as I get mail on Nabble and have been told I can't reply through Nabble]. Thanks Kennel for recommended a narrowing range for usr y-limits. That does help quite a bit. But I found a disappointing aspect of the graphics system: When you change height= on the device call you have to change the y-coordinates to keep the same absolute vertical positioning. Frank -- Frank E Harrell Jr Professor and Chairman School of Medicine Department of Biostatistics Vanderbilt University __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Multiple interaction terms in GAMM-model
On Thu, 2013-07-25 at 04:56 -0700, Jeroen wrote: Dear all, I am trying to correlate a variable tau1 to a set of other variables (x1, x2, x3, x4), taking into account an interaction with time ('doy') and place ('region'), and taking into account dependency of data in time per object ID. My dataset looks like: snip / Since the data is dependent in time per objectID, I use a GAMM model with an autocorrelation function. Since each variable x1, x2, etc. is dependent on time and place, I should incorporate this as well. Therefore I am wondering if the following gamm-model is correct for my situation: model - gamm( tau1 ~ te( x1, by= doy ) + te( x1, by= factor( region ) ) + ... + te( x4, by= doy ) + te( x4, by= factor( region ) ) + factor( region ), correlation= corAR1(form= ~ doy|objectID ), na.action= na.omit ). Does anyone know if this is ok? The model looks a little odd - the way you've written it you don't need `te()` smooths. For the interaction with doy, I would try te(x1, doy, bs = c(cr, cc)) assuming x1 and doy are continuous. This specifies cubic splines and cyclic cubic splines respectively for x1 and doy. I use a cyclic spline for doy as we might not expect Jan 1st and Dec 31st to be much different. By separating the interaction between x1 and doy and the one for x1 and region you are saying that the way the effect of x1 varies through the year does not change between regions. Is that reasonable? I think you need something for the objectID - a random effect or a fixed effect will account for differences in mean values of taul between objects. A random effect seems most useful otherwise you'll eat into your degrees of freedom, but you have plenty of those. The correlation part assumes that the residuals follow an AR(1) applied within the objects, with each AR(1) having the same coefficient. Autocorrelation decreases exponentially with lag/separation. It is a reasonable start. Fitting will be much quicker without this. Could you try fitting without and then extract the normalised residuals to check for residual correlation? I hope you have a lot of memory and a lot of spare time available; my experience of fitting similarly sized (though not as complex - I had one *very* long series) was that gamm() used large amounts of RAM (I had 16GB and most was used for a single model) and took many days to fit. You may find the random effect for object fights with the AR(1) so you could end up with an odd fit if it fits at all. You are going to want to turn on verbosity when fitting so you can see how the optimisation is proceeding. Something like require(nlme) ## need a control object ctrl - lmeControl(msVerbose = TRUE, maxIter = 400, msMaxIter = 400, niterEM = 0, ## this is VERY important for gamm()! tolerance = 1e-8, msTol = 1e-8, msMaxEval = 400) then add `control = ctrl` to the `gamm()` call. I would approach the expecting the model to fail to fit so be prepared to simplify it etc. Good luck, G Or should I use a model which also includes terms like te( x1 ) + ... + te( x4 ). And is the correlation function correct? Thanks so much!! Jeroen -- View this message in context: http://r.789695.n4.nabble.com/Multiple-interaction-terms-in-GAMM-model-tp4672297.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Gavin Simpson, PhD [t] +1 306 337 8863 Adjunct Professor, Department of Biology[f] +1 306 337 2410 Institute of Environmental Change Society [e] gavin.simp...@uregina.ca 523 Research and Innovation Centre [tw] @ucfagls University of Regina Regina, SK S4S 0A2, Canada __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Can't figure out why short figure won't work
lattice graphics allow you to specify units in cm or inches, and then the absolute vertical positioning can be controlled. On Fri, Jul 26, 2013 at 12:58 PM, Frank Harrell f.harr...@vanderbilt.eduwrote: [Sorry I can't quote past messages as I get mail on Nabble and have been told I can't reply through Nabble]. Thanks Kennel for recommended a narrowing range for usr y-limits. That does help quite a bit. But I found a disappointing aspect of the graphics system: When you change height= on the device call you have to change the y-coordinates to keep the same absolute vertical positioning. Frank -- Frank E Harrell Jr Professor and Chairman School of Medicine Department of Biostatistics Vanderbilt University __** R-help@r-project.org mailing list https://stat.ethz.ch/mailman/**listinfo/r-helphttps://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/** posting-guide.html http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to double integrate a function in R
On Jul 26, 2013, at 9:33 AM, David Winsemius wrote: fun2 - function(x,y) { (xy)*(x0)* (1/(2*pi))*exp(-y/2)* sqrt((x/(y-x)))} persp(outer(0:5, 0:6, fun2) ) There does seem to be some potential pathology at the edges of the range, Restricting it to x = 0.03 removes most of that concern. fun2 - function(x,y) { (xy)*(x0)* (1/(2*pi))*exp(-y/2)* sqrt((x/(y-x)))} persp(outer(seq(0.01,5,by=.01), seq(.02,6,by=.01), fun2) ,ticktype=detailed) fun - function(x) { (x[1]x[2])*(x[1]0)* (1/(2*pi))*exp(-x[2]/2)*if(x[1]x[2]){0}else{ sqrt((x[1]/(x[2]-x[1])) )}} adaptIntegrate(fun, lower = c(0.03,0.03), upper =c(5, 6), tol=1e-2 ) $integral [1] 0.7605703 $error [1] 0.00760384 $functionEvaluations [1] 190859 $returnCode [1] 0 I tried decreasing the tolerance to 1e-3 but the wait exceeds the patience I have allocated to the problem. -- David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Externalptr class to character class from (web) scrape
On 26/07/2013 12:43 PM, Nick McClure wrote: I'm hitting a wall. When I use the 'scrape' function from the package 'scrapeR' to get the pagesource from a web page, I do the following: (as an example) website.doc = parse(http://www.google.com;) When I look at it, it seems fine: website.doc[[1]] This seems to have the information I need. Then when I try to get it into a character vector, character.website = as.character(website.doc[[1]]) I get the error: Error in as.vector(x, character) : cannot coerce type 'externalptr' to vector of type 'character' I'm trying very very hard to wrap my head around how to get this external pointer to a character, but after reading many help files, I cannot understand how to do this. Any ideas? You should use str() in cases like this. When I look at str(website.doc[[1]]) (after producing website.doc with scrape(), not parse()), I see str(website.doc[[1]]) Classes 'HTMLInternalDocument', 'HTMLInternalDocument', 'XMLInternalDocument', 'XMLAbstractDocument' externalptr - attr(*, headers)= Named chr [1:2] HTMLHEADmeta http-equiv=\content-type\ content=\text/html;charset=utf-8\\nTITLE302 Moved/TITLE/HEADBODY\nH1| __truncated__ /BODY/HTML ..- attr(*, names)= chr [1:2] HTMLHEADmeta http-equiv=\content-type\ content=\text/html;charset=utf-8\\nTITLE302 Moved/TITLE/HEADBODY\nH1| __truncated__ /BODY/HTML So it is an external pointer with a number of classes. One or more of those will have a print method. methods(print) will list all the print methods, and I see there's a (hidden) print.XMLInternalDocument method somewhere. Then getAnywhere(print.XMLInternalDocument) A single object matching ‘print.XMLInternalDocument’ was found It was found in the following places registered S3 method for print from namespace XML namespace:XML with value function (x, ...) { cat(as(x, character), \n) } environment: namespace:XML shows that the as() generic should work, even though as.character() doesn't, and indeed as(website.doc[[1]], character) does display something. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Externalptr class to character class from (web) scrape
By the way, here's a quicker (but slightly more dangerous) way to find the print.XMLInternalDocument function: just call debug(print) before printing the object. Two steps take you to the method, and let you see what it's doing. The danger comes because now print() will always trigger the debugger, which can be a little confusing! Remember undebug(print) at the end. Duncan Murdoch On 26/07/2013 1:26 PM, Duncan Murdoch wrote: On 26/07/2013 12:43 PM, Nick McClure wrote: I'm hitting a wall. When I use the 'scrape' function from the package 'scrapeR' to get the pagesource from a web page, I do the following: (as an example) website.doc = parse(http://www.google.com;) When I look at it, it seems fine: website.doc[[1]] This seems to have the information I need. Then when I try to get it into a character vector, character.website = as.character(website.doc[[1]]) I get the error: Error in as.vector(x, character) : cannot coerce type 'externalptr' to vector of type 'character' I'm trying very very hard to wrap my head around how to get this external pointer to a character, but after reading many help files, I cannot understand how to do this. Any ideas? You should use str() in cases like this. When I look at str(website.doc[[1]]) (after producing website.doc with scrape(), not parse()), I see str(website.doc[[1]]) Classes 'HTMLInternalDocument', 'HTMLInternalDocument', 'XMLInternalDocument', 'XMLAbstractDocument' externalptr - attr(*, headers)= Named chr [1:2] HTMLHEADmeta http-equiv=\content-type\ content=\text/html;charset=utf-8\\nTITLE302 Moved/TITLE/HEADBODY\nH1| __truncated__ /BODY/HTML ..- attr(*, names)= chr [1:2] HTMLHEADmeta http-equiv=\content-type\ content=\text/html;charset=utf-8\\nTITLE302 Moved/TITLE/HEADBODY\nH1| __truncated__ /BODY/HTML So it is an external pointer with a number of classes. One or more of those will have a print method. methods(print) will list all the print methods, and I see there's a (hidden) print.XMLInternalDocument method somewhere. Then getAnywhere(print.XMLInternalDocument) A single object matching ‘print.XMLInternalDocument’ was found It was found in the following places registered S3 method for print from namespace XML namespace:XML with value function (x, ...) { cat(as(x, character), \n) } environment: namespace:XML shows that the as() generic should work, even though as.character() doesn't, and indeed as(website.doc[[1]], character) does display something. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Error: Line starting ' ...' is malformed!
A DESCRIPTION file begins with 0xFFFE and $ file DESCRIPTION DESCRIPTION: Little-endian UTF-16 Unicode text, with CRLF, CR line terminators I think it was created on Windows. In R (2,15,1 on Debian GNU/Linux), using roxygen2, I get roxygenize(../GitHub/mice) Error: Line starting '��P ...' is malformed! Enter a frame number, or 0 to exit 1: roxygenize(../GitHub/mice) 2: read.description(DESCRIPTION) 3: read.dcf(file) Selection: 3 Called from: read.description(DESCRIPTION) Browse[1] Q I'm not sure if the first 2 characters after line starting ', which are octal 377, 376, will survive email; I stripped them out of the subject line. The files (DESCRIPTION isn't the only one) have also caused trouble for git (even on Windows 7), since it thinks they are binary. Any advice about what to do? I'm reluctant to change the format of the files because it's not my package. Ross Boylan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Pairwise comparison between columns, logic
Hi, Using the example code without removing CEBPA: gset- read.table(Names.txt,header=TRUE,stringsAsFactors=FALSE) temp1- read.table(Data.txt,header=TRUE,stringsAsFactors=FALSE) lst1-split(temp1,temp1$Names) mat1-combn(gset[,1],2) library(plyr) lst2-lapply(split(mat1,col(mat1)),function(x){lst1[x][all(lapply(lst1[x],length)==2)]}) lst3-lapply(lst2[lapply(lst2,length)==2],function(x) {x1- join_all(x,by=patient_id,type=inner);x2-x1[patient_id];row.names(x2)-if(nrow(x1)!=0) paste(x1[,1],x1[,3],1:nrow(x1),sep=_) else NULL;x2 }) Reduce(rbind,lst3) # patient_id #DNMT3A_FLT3_1 LAML-AB-2811-TB #DNMT3A_FLT3_2 LAML-AB-2816-TB #DNMT3A_FLT3_3 LAML-AB-2818-TB #DNMT3A_IDH1_1 LAML-AB-2802-TB #DNMT3A_IDH1_2 LAML-AB-2822-TB #DNMT3A_NPM1_1 LAML-AB-2802-TB #DNMT3A_NPM1_2 LAML-AB-2809-TB #DNMT3A_NPM1_3 LAML-AB-2811-TB #DNMT3A_NPM1_4 LAML-AB-2816-TB #DNMT3A_NRAS_1 LAML-AB-2816-TB #FLT3_NPM1_1 LAML-AB-2811-TB #FLT3_NPM1_2 LAML-AB-2812-TB #FLT3_NPM1_3 LAML-AB-2816-TB #FLT3_NRAS_1 LAML-AB-2816-TB #IDH1_NPM1_1 LAML-AB-2802-TB #NPM1_NRAS_1 LAML-AB-2816-TB From your original dataset: gset- read.table(SampleGenes.txt,header=TRUE,stringsAsFactors=FALSE) temp0- read.table(LAML-TB.final_analysis_set.txt,header=TRUE,stringsAsFactors=FALSE,sep=\t) temp1- temp0[,c(Hugo_Symbol,firehose_patient_id)] str(temp1) #'data.frame': 2221 obs. of 2 variables: # $ Hugo_Symbol : chr TBX15 TCHHL1 DNMT3A IDH1 ... # $ firehose_patient_id: chr LAML-AB-2802-TB LAML-AB-2802-TB LAML-AB-2802-TB LAML-AB-2802-TB ... lst1-split(temp1,temp1$Hugo_Symbol) length(lst1) #[1] 1607 mat1-combn(gset[,1],2) # Generate all lst2-lapply(split(mat1,col(mat1)),function(x){lst1[x][all(lapply(lst1[x],length)==2)]}) length(lst2) #[1] 105 lst3-lapply(lst2[lapply(lst2,length)==2],function(x) {x1- join_all(x,by=firehose_patient_id,type=inner);x2-x1[firehose_patient_id];row.names(x2)-if(nrow(x1)!=0) paste(x1[,1],x1[,3],1:nrow(x1),sep=_) else NULL;x2 }) res-Reduce(rbind,lst3) nrow(res) #[1] 234 head(res) # firehose_patient_id #NPM1_FLT3_1 LAML-AB-2811-TB #NPM1_FLT3_2 LAML-AB-2812-TB #NPM1_FLT3_3 LAML-AB-2816-TB #NPM1_FLT3_4 LAML-AB-2818-TB #NPM1_FLT3_5 LAML-AB-2825-TB #NPM1_FLT3_6 LAML-AB-2836-TB Regarding your second question: setdiff(gset[,1],unique(temp1[,1])) # CEBPA was not found in the temp1[,1] #[1] CEBPA mat2- combn(gset[-5,1],2) vec1- apply(mat2,2,paste,collapse=_) vec2-unique(gsub((.*\\_.*)\\_.*,\\1,row.names(res))) setdiff(vec1,vec2) #[1] NPM1_TP53 NPM1_EZH2 NPM1_RUNX1 NPM1_ASXL1 NPM1_KDM6A #[6] FLT3_TP53 FLT3_EZH2 FLT3_KRAS FLT3_ASXL1 FLT3_KDM6A #[11] IDH1_TP53 IDH1_KRAS NRAS_IDH2 NRAS_KRAS NRAS_ASXL1 #[16] NRAS_KDM6A TP53_EZH2 TP53_IDH2 TP53_RUNX1 TP53_KRAS #[21] TP53_WT1 TP53_ASXL1 TP53_KDM6A EZH2_IDH2 EZH2_WT1 #[26] EZH2_ASXL1 EZH2_KDM6A IDH2_TET2 IDH2_KDM6A RUNX1_KDM6A #[31] KRAS_WT1 KRAS_KDM6A WT1_ASXL1 WT1_TET2 WT1_KDM6A #[36] ASXL1_TET2 ASXL1_KDM6A TET2_KDM6A A.K. - Original Message - From: arun smartpink...@yahoo.com To: Manisha manishab...@gmail.com Cc: R help r-help@r-project.org Sent: Friday, July 26, 2013 11:18 AM Subject: Re: Pairwise comparison between columns, logic Hi Manisha, I didn't run your dataset as I am on the way to college. But, from the error reported, I think it will be due to some missing combinations in one of the dataset. For ex. if you run my previous code without removing CEBPA: ie. mat1- combn(gset[,1],2) lst2-lapply(split(mat1,col(mat1)),function(x) {x1-join_all(lst1[x],by=patient_id,type=inner);x1[patient_id] } ) #Error: All inputs to rbind.fill must be data.frames So, check whether all the combinations are available in the `lst1`. 2. I will get back to you once I run it. A.K. From: Manisha manishab...@gmail.com To: arun smartpink...@yahoo.com Sent: Friday, July 26, 2013 11:09 AM Subject: Re: Pairwise comparison between columns, logic Hi Arun, I ran the script on a larger dataset and I seem to be running into this following error: Error: All inputs to rbind.fill must be data.frames after the step; lst2-lapply(split(mat1,col(mat1)),function(x) {x1-join_all(lst1[x],by=firehose_patient_id,type=inner);x1[firehose_patient_id]}) I tried a few things to solve the issue but I am not able to. The format of input files and data are same as in the code you posted. Could you suggest me something? I have attached my input files on which I am trying to run the code. See attached code as well. Minor changes have been made by me. 2. I have another question. From your code how do also capture those pairs of names that donot have any common patient id? Thanks again, -M On Fri, Jul 26, 2013 at 9:29 AM, arun smartpink...@yahoo.com wrote: Hi M, No problem. Regards, Arun - Original Message - From: manishab...@gmail.com manishab...@gmail.com To: smartpink...@yahoo.com Cc: Sent: Friday, July 26, 2013
Re: [R] Pairwise comparison between columns, logic
Hi Arun, Many thanks for following this through. Based on your earlier suggestion with CEBPA, I also edited the code and now it seems work (I removed the non exiting names). I looked into the current code that you have sent below and I learnt a few more things. So thank you again. Your skills in R is very good. Could you suggest some resources for advance R programming. I am fairly comfortable with R, but need to keep abreast with the latest R libraries that can make life a lot simpler. Thanks again, -M On Fri, Jul 26, 2013 at 2:20 PM, arun kirshna [via R] ml-node+s789695n4672460...@n4.nabble.com wrote: Hi, Using the example code without removing CEBPA: gset- read.table(Names.txt,header=TRUE,stringsAsFactors=FALSE) temp1- read.table(Data.txt,header=TRUE,stringsAsFactors=FALSE) lst1-split(temp1,temp1$Names) mat1-combn(gset[,1],2) library(plyr) lst2-lapply(split(mat1,col(mat1)),function(x){lst1[x][all(lapply(lst1[x],length)==2)]}) lst3-lapply(lst2[lapply(lst2,length)==2],function(x) {x1- join_all(x,by=patient_id,type=inner);x2-x1[patient_id];row.names(x2)-if(nrow(x1)!=0) paste(x1[,1],x1[,3],1:nrow(x1),sep=_) else NULL;x2 }) Reduce(rbind,lst3) # patient_id #DNMT3A_FLT3_1 LAML-AB-2811-TB #DNMT3A_FLT3_2 LAML-AB-2816-TB #DNMT3A_FLT3_3 LAML-AB-2818-TB #DNMT3A_IDH1_1 LAML-AB-2802-TB #DNMT3A_IDH1_2 LAML-AB-2822-TB #DNMT3A_NPM1_1 LAML-AB-2802-TB #DNMT3A_NPM1_2 LAML-AB-2809-TB #DNMT3A_NPM1_3 LAML-AB-2811-TB #DNMT3A_NPM1_4 LAML-AB-2816-TB #DNMT3A_NRAS_1 LAML-AB-2816-TB #FLT3_NPM1_1 LAML-AB-2811-TB #FLT3_NPM1_2 LAML-AB-2812-TB #FLT3_NPM1_3 LAML-AB-2816-TB #FLT3_NRAS_1 LAML-AB-2816-TB #IDH1_NPM1_1 LAML-AB-2802-TB #NPM1_NRAS_1 LAML-AB-2816-TB From your original dataset: gset- read.table(SampleGenes.txt,header=TRUE,stringsAsFactors=FALSE) temp0- read.table(LAML-TB.final_analysis_set.txt,header=TRUE,stringsAsFactors=FALSE,sep=\t) temp1- temp0[,c(Hugo_Symbol,firehose_patient_id)] str(temp1) #'data.frame':2221 obs. of 2 variables: # $ Hugo_Symbol: chr TBX15 TCHHL1 DNMT3A IDH1 ... # $ firehose_patient_id: chr LAML-AB-2802-TB LAML-AB-2802-TB LAML-AB-2802-TB LAML-AB-2802-TB ... lst1-split(temp1,temp1$Hugo_Symbol) length(lst1) #[1] 1607 mat1-combn(gset[,1],2) # Generate all lst2-lapply(split(mat1,col(mat1)),function(x){lst1[x][all(lapply(lst1[x],length)==2)]}) length(lst2) #[1] 105 lst3-lapply(lst2[lapply(lst2,length)==2],function(x) {x1- join_all(x,by=firehose_patient_id,type=inner);x2-x1[firehose_patient_id];row.names(x2)-if(nrow(x1)!=0) paste(x1[,1],x1[,3],1:nrow(x1),sep=_) else NULL;x2 }) res-Reduce(rbind,lst3) nrow(res) #[1] 234 head(res) #firehose_patient_id #NPM1_FLT3_1 LAML-AB-2811-TB #NPM1_FLT3_2 LAML-AB-2812-TB #NPM1_FLT3_3 LAML-AB-2816-TB #NPM1_FLT3_4 LAML-AB-2818-TB #NPM1_FLT3_5 LAML-AB-2825-TB #NPM1_FLT3_6 LAML-AB-2836-TB Regarding your second question: setdiff(gset[,1],unique(temp1[,1])) # CEBPA was not found in the temp1[,1] #[1] CEBPA mat2- combn(gset[-5,1],2) vec1- apply(mat2,2,paste,collapse=_) vec2-unique(gsub((.*\\_.*)\\_.*,\\1,row.names(res))) setdiff(vec1,vec2) #[1] NPM1_TP53 NPM1_EZH2 NPM1_RUNX1 NPM1_ASXL1 NPM1_KDM6A #[6] FLT3_TP53 FLT3_EZH2 FLT3_KRAS FLT3_ASXL1 FLT3_KDM6A #[11] IDH1_TP53 IDH1_KRAS NRAS_IDH2 NRAS_KRAS NRAS_ASXL1 #[16] NRAS_KDM6A TP53_EZH2 TP53_IDH2 TP53_RUNX1 TP53_KRAS #[21] TP53_WT1TP53_ASXL1 TP53_KDM6A EZH2_IDH2 EZH2_WT1 #[26] EZH2_ASXL1 EZH2_KDM6A IDH2_TET2 IDH2_KDM6A RUNX1_KDM6A #[31] KRAS_WT1KRAS_KDM6A WT1_ASXL1 WT1_TET2WT1_KDM6A #[36] ASXL1_TET2 ASXL1_KDM6A TET2_KDM6A A.K. - Original Message - From: arun [hidden email]http://user/SendEmail.jtp?type=nodenode=4672460i=0 To: Manisha [hidden email]http://user/SendEmail.jtp?type=nodenode=4672460i=1 Cc: R help [hidden email]http://user/SendEmail.jtp?type=nodenode=4672460i=2 Sent: Friday, July 26, 2013 11:18 AM Subject: Re: Pairwise comparison between columns, logic Hi Manisha, I didn't run your dataset as I am on the way to college. But, from the error reported, I think it will be due to some missing combinations in one of the dataset. For ex. if you run my previous code without removing CEBPA: ie. mat1- combn(gset[,1],2) lst2-lapply(split(mat1,col(mat1)),function(x) {x1-join_all(lst1[x],by=patient_id,type=inner);x1[patient_id] } ) #Error: All inputs to rbind.fill must be data.frames So, check whether all the combinations are available in the `lst1`. 2. I will get back to you once I run it. A.K. From: Manisha [hidden email]http://user/SendEmail.jtp?type=nodenode=4672460i=3 To: arun [hidden email]http://user/SendEmail.jtp?type=nodenode=4672460i=4 Sent: Friday, July 26, 2013 11:09 AM Subject: Re: Pairwise comparison between columns, logic Hi Arun, I ran the script on a larger dataset and I seem to
Re: [R] Duplicated function with conditional statement
On 25.07.2013 21:05, vanessa van der vaart wrote: Hi everybody,, I have a question about R function duplicated(). I have spent days try to figure this out,but I cant find any solution yet. I hope somebody can help me.. this is my data: subj=c(1,1,1,2,2,3,3,3,4,4) response=c('sample','sample','buy','sample','buy','sample',' sample','buy','sample','buy') product=c(1,2,3,2,2,3,2,1,1,4) tt=data.frame(subj, response, product) the data look like this: subj response product 1 1 sample 1 2 1 sample 2 3 1 buy 3 4 2 sample 2 5 2 buy 2 6 3 sample 3 7 3 sample 2 8 3 buy 1 9 4 sample 1 10 4 buy4 I want to create new column based on the value on response and product column. if the value on product is duplicated, then the value on new column is 1, otherwise is 0. According to your description: tt$newcolumn - as.integer(duplicated(tt$product) tt$response==buy) which is different from what you show us below, where I cannot derive any systematic rule from. Uwe Ligges but I want to add conditional statement that the value on product column will only be considered as duplicated if the value on response column is 'buy'. for illustration, the table should look like this: subj response product newcolumn 1 1 sample 1 0 2 1 sample 2 0 3 1 buy 3 0 4 2 sample 2 0 5 2 buy 2 0 6 3 sample 3 1 7 3 sample 2 1 8 3 buy 1 0 9 4 sample 11 10 4 buy 4 0 can somebody help me? any help will be appreciated. I am new in this mailing list, so forgive me in advance, If I did not ask the question appropriately. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SpatialPolygonsDataFrame and unique()
Thank you very much for the help, that's what I was looking for! Also my apologies for sending an html, I thought I turned it off (hopefully this time works). Best regards, Nicola __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] add different regression lines for groups on ggplot
Hey All, I need to apply different regression lines to different group on my ggplot, and here is the code I use: qplot(x=Var1,y=Var2,data=df,color=SiteID,group=SiteID)+geom_point()+geom_smooth(method='lm',formula=log(y)~I(1/x),se=FALSE,size=2) However the regression for different groups is as below: AL1/AL2: log(y)~I(1/x) AL3: log(y)~log(x) How can I apply each regression equation on the same ggplot? Also I have an issue that if I use the code above, the regression lines are not overlapped on top of my data points. Thanks for your help! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] readTiff - Sorry can't handle images with 32-bit samples
I tried using readTiff() and got the error message Sorry can't handle images with 32-bit samples line of code x - readTiff(C:/Users/550062/Desktop/Data/example1.tif) So far I have not had any luck finding this error message on google. Any guess at what it means and how to get the code to work? Thanks! -- View this message in context: http://r.789695.n4.nabble.com/readTiff-Sorry-can-t-handle-images-with-32-bit-samples-tp4672465.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Duplicated function with conditional statement
On Jul 26, 2013, at 11:51 AM, Uwe Ligges wrote: On 25.07.2013 21:05, vanessa van der vaart wrote: Hi everybody,, I have a question about R function duplicated(). I have spent days try to figure this out,but I cant find any solution yet. I hope somebody can help me.. this is my data: subj=c(1,1,1,2,2,3,3,3,4,4) response=c('sample','sample','buy','sample','buy','sample',' sample','buy','sample','buy') product=c(1,2,3,2,2,3,2,1,1,4) tt=data.frame(subj, response, product) the data look like this: subj response product 1 1 sample 1 2 1 sample 2 3 1 buy 3 4 2 sample 2 5 2 buy 2 6 3 sample 3 7 3 sample 2 8 3 buy 1 9 4 sample 1 10 4 buy4 I want to create new column based on the value on response and product column. if the value on product is duplicated, then the value on new column is 1, otherwise is 0. According to your description: Agree that the description did not match the output. I tried to match the output using a rule that could be expressed as: if( a buy- associated product value precedes the current product value){1}else{0} -- David. tt$newcolumn - as.integer(duplicated(tt$product) tt$response==buy) which is different from what you show us below, where I cannot derive any systematic rule from. Uwe Ligges but I want to add conditional statement that the value on product column will only be considered as duplicated if the value on response column is 'buy'. for illustration, the table should look like this: subj response product newcolumn 1 1 sample 1 0 2 1 sample 2 0 3 1 buy 3 0 4 2 sample 2 0 5 2 buy 2 0 6 3 sample 3 1 7 3 sample 2 1 8 3 buy 1 0 9 4 sample 11 10 4 buy 4 0 can somebody help me? any help will be appreciated. I am new in this mailing list, so forgive me in advance, If I did not ask the question appropriately. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Duplicated function with conditional statement
On Jul 26, 2013, at 2:06 PM, David Winsemius wrote: On Jul 26, 2013, at 11:51 AM, Uwe Ligges wrote: On 25.07.2013 21:05, vanessa van der vaart wrote: Hi everybody,, I have a question about R function duplicated(). I have spent days try to figure this out,but I cant find any solution yet. I hope somebody can help me.. this is my data: subj=c(1,1,1,2,2,3,3,3,4,4) response=c('sample','sample','buy','sample','buy','sample',' sample','buy','sample','buy') product=c(1,2,3,2,2,3,2,1,1,4) tt=data.frame(subj, response, product) the data look like this: subj response product 1 1 sample 1 2 1 sample 2 3 1 buy 3 4 2 sample 2 5 2 buy 2 6 3 sample 3 7 3 sample 2 8 3 buy 1 9 4 sample 1 10 4 buy4 I want to create new column based on the value on response and product column. if the value on product is duplicated, then the value on new column is 1, otherwise is 0. According to your description: Agree that the description did not match the output. I tried to match the output using a rule that could be expressed as: if( a buy- associated product value precedes the current product value){1}else{0} So this delivers the specified output: tt$rown - rownames(tt) as.numeric ( apply(tt, 1, function(x) { x['product'] %in% tt[ rownames(tt) x['rown'] tt$response == buy, product] } ) ) # [1] 0 0 0 0 0 1 1 0 1 0 -- David. tt$newcolumn - as.integer(duplicated(tt$product) tt$response==buy) which is different from what you show us below, where I cannot derive any systematic rule from. Uwe Ligges but I want to add conditional statement that the value on product column will only be considered as duplicated if the value on response column is 'buy'. for illustration, the table should look like this: subj response product newcolumn 1 1 sample 1 0 2 1 sample 2 0 3 1 buy 3 0 4 2 sample 2 0 5 2 buy 2 0 6 3 sample 3 1 7 3 sample 2 1 8 3 buy 1 0 9 4 sample 11 10 4 buy 4 0 can somebody help me? any help will be appreciated. I am new in this mailing list, so forgive me in advance, If I did not ask the question appropriately. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] readTiff - Sorry can't handle images with 32-bit samples
Try reading a different file. The error message says the existing code cannot read that file. --- Jeff NewmillerThe . . Go Live... DCN:jdnew...@dcn.davis.ca.usBasics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/BatteriesO.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. wwreith reith_will...@bah.com wrote: I tried using readTiff() and got the error message Sorry can't handle images with 32-bit samples line of code x - readTiff(C:/Users/550062/Desktop/Data/example1.tif) So far I have not had any luck finding this error message on google. Any guess at what it means and how to get the code to work? Thanks! -- View this message in context: http://r.789695.n4.nabble.com/readTiff-Sorry-can-t-handle-images-with-32-bit-samples-tp4672465.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] readTiff - Sorry can't handle images with 32-bit samples
Disclaimer: I haven't seen your tif file and I know nothing about readTiff... but here go some general comments. TIF files can use different bit depths (number of bits to store each pixel (or each color for each pixel). Most common software outputs 8- or 16-bits, but your file probably has a higher bit depth of 32 bits per sample. Apparently readTiff cannot handle such bit depth. You may need to convert the 32-bit delth file(s) into 16-bit depth (or whatever readTiff can handle). My suggestion would be to look at ImageMagick, but you may also be able to use some image editing applications to do that, Peter On Fri, Jul 26, 2013 at 1:51 PM, wwreith reith_will...@bah.com wrote: I tried using readTiff() and got the error message Sorry can't handle images with 32-bit samples line of code x - readTiff(C:/Users/550062/Desktop/Data/example1.tif) So far I have not had any luck finding this error message on google. Any guess at what it means and how to get the code to work? Thanks! -- View this message in context: http://r.789695.n4.nabble.com/readTiff-Sorry-can-t-handle-images-with-32-bit-samples-tp4672465.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Is it possible (/reasonable) to write an as.RefClass function for R list class?
Hello all. I am interested in creating a mutable list object in R. Ideally I would do it through something like this: x - list(a = list(1:4), b = miao, c = 4:1) x_RC - as.RefClass(x) attr(x_RC[[2]], I am) - string x_RC[[3]] - x_RC[[3]]+1 x_new - as.list(x_RC) Is that reasonably possible to create? Or does that make little/no-sense? Thanks. Contact Details:--- Contact me: tal.gal...@gmail.com | Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English) -- [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Is it possible (/reasonable) to write an as.RefClass function for R list class?
Reference classes are implemented with environment objects, which are mutable. Once you convert to list, the converted object is not mutable, since there is no such thing as a mutable list. So I would say your request is not possible. --- Jeff NewmillerThe . . Go Live... DCN:jdnew...@dcn.davis.ca.usBasics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/BatteriesO.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. Tal Galili tal.gal...@gmail.com wrote: Hello all. I am interested in creating a mutable list object in R. Ideally I would do it through something like this: x - list(a = list(1:4), b = miao, c = 4:1) x_RC - as.RefClass(x) attr(x_RC[[2]], I am) - string x_RC[[3]] - x_RC[[3]]+1 x_new - as.list(x_RC) Is that reasonably possible to create? Or does that make little/no-sense? Thanks. Contact Details:--- Contact me: tal.gal...@gmail.com | Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English) -- [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Boxcox transformation error
Hello I'm trying to run the script below for making a boxcox transformation of some variables contained on an excel file, but i can't get it. I ever have the same message : error : $ operator is invalid for atomic vectors One of the names of the variables is Ec30 and it's the variable I put as example. require(RODBC) require(fBasics) require(e1071) require(MASS) setwd(C:\\estadisticafijo) dir() temp=odbcConnectExcel(prueba35r) rs-sqlFetch(temp,Hoja1) close(temp) fix(rs) boxcox(rs$Ec30) Thaks a lot for your help. Miller Ruiz Assistant Research Cenipalma Colombia [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] GGplot 2 – cannot get histogram and box plot axis to match.
Dear Thierry, Thank you for your help. My code now works the way I wanted it to. To other readers out there, I think the reason why the two functions work differently is because extend_limits only ensures that the axis includes the coordinates supplied, and does not guarantee that limits will be the coordinates you supplied to the function (they might extend further in some graphs like the heatmap) Thierry, if any of the above is wrong or incomplete, feel free to correct me or add more details. John Clow UCSB Student From: ONKELINX, Thierry [thierry.onkel...@inbo.be] Sent: Friday, July 26, 2013 1:03 AM To: John W. Clow; r-help@r-project.org Subject: RE: GGplot 2 – cannot get histogram and box plot axis to match. Dear John, Use xlim() and ylim() instead of expand_limits() library(ggplot2) #sample data from ggplot2 data(Cars93, package = MASS) dataSet - Cars93 #variables to calculate the range to extend the axis dataVector - unlist(dataSet[,MPG.city]) dataRange - diff(range(dataSet$MPG.city)) graphRange - c(min(dataSet$MPG.city) - dataRange/5, max(dataSet$MPG.city) + dataRange/5) #making the box plot theBoxPlot - ggplot(dataSet,aes_string(x = MPG.city, y = MPG.city)) theBoxPlot - theBoxPlot + geom_boxplot() + coord_flip() + ylim(limits = graphRange) print(theBoxPlot) #making the histogram thePlot - ggplot(dataSet,aes_string(x = MPG.city)) thePlot - thePlot + geom_histogram() + xlim(graphRange) print(thePlot) ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest team Biometrie Kwaliteitszorg / team Biometrics Quality Assurance Kliniekstraat 25 1070 Anderlecht Belgium + 32 2 525 02 51 + 32 54 43 61 85 thierry.onkel...@inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -Oorspronkelijk bericht- Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Namens John W. Clow Verzonden: donderdag 25 juli 2013 20:03 Aan: r-help@r-project.org Onderwerp: [R] GGplot 2 – cannot get histogram and box plot axis to match. Problem: I am trying to get the histogram and box plot x axis to match. I’ve tried using the expand_limits function to make the axis match but that didn’t make the axis match. The histogram’s axis are still consistently larger than the ones for the box plot (though the function did help). Does anyone have a suggestion as to what I should do instead? Background: I am building a Shiny app that displays a histogram below a bar chart for a set of data that a user uploads to the app. If you want to see the app, go here http://spark.rstudio.com/jclow/Archive20130725HistogramApp/ To run the app, select “Use Sample Data” , then select “MPG.city” under choose a column, then finally select box plot. Sample code: Below is a snippet of my code to demonstrate the problems I have. library(ggplot2) #sample data from ggplot2 data(Cars93, package = MASS) dataSet - Cars93 #variables to calculate the range to extend the axis dataVector - unlist(dataSet[,MPG.city]) dataRange - max(dataVector) - min(dataVector) graphRange - c(min(dataVector) - dataRange/5, max(dataVector) + dataRange/5) #making the box plot theBoxPlot - ggplot(dataSet,aes_string(x = MPG.city,y = MPG.city)) theBoxPlot = theBoxPlot + geom_boxplot() + expand_limits(y= graphRange) + coord_flip() print(theBoxPlot) #making the histogram thePlot - ggplot(dataSet,aes_string(x = MPG.city)) thePlot -thePlot + geom_histogram() + expand_limits(x= graphRange) print(thePlot) Thank you for taking the time to read this. John Clow UCSB Student __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. * * * * * * * * * * * * * D I S C L A I M E R * * * * * * * * * * * * * Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is door een geldig ondertekend document. The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
[R] matching columns of model matrix to those in original data.frame
What is a reliable way to go from a column of a model matrix back to the column (or columns) of the original data source used to make the model matrix? I can come up with a method that seems to work, but I don't see guarantees in the documentation that it will. In particular, does the order of the term.labels match the order of columns for factors in a terms object? The documentation says the model.matrix assign attribute uses the ordering of terms.labels. If anyone can tell me if this approach is reliable, or of one that is, I would appreciate it. Ross Boylan Proposed function and a little example follow. # return a vector v such that data[,v[i]] contributed to mm[,i] # mm = model matrix produced by # form = formula # data = data reverse.map - function(mm, form, data){ tt - terms(form, data=data) ttf - attr(tt, factors) mmi - attr(mm, assign) # this depends on assign using same order as columns of factors # entries in mmi that are 0 (the intercept) are silently dropped ttf2 - ttf[,mmi] # take the first row that contributes r - apply(ttf2, 2, function(is) rownames(ttf)[is 0][1]) match(r, colnames(data)) } ### experiment with mapping model matrix to original columns df - sp2b[sample(nrow(sp2b), 8), c(pEthnic, ethnic_sg, rac_gay)] form - ~pEthnic+ethnic_sg*rac_gay mm - model.matrix(form, df) tt - terms(form, data=df) ttf - attr(tt, factors) mmi - attr(mm, assign) df pEthnic ethnic_sg rac_gay 1366 Afr Amer Afr Amer3.25 3052 Afr Amer Afr Amer1.75 3012 Latino Afr Amer2.00 369 Afr Amer Asian/PI2.00 529 White Asian/PI2.00 194 Asian/PI Asian/PI3.25 126 White Asian/PI2.25 2147 LatinoLatino2.75 colnames(mm) [1] (Intercept) pEthnicAsian/PI [3] pEthnicLatino pEthnicOther [5] pEthnicWhite ethnic_sgAsian/PI [7] ethnic_sgLatino rac_gay [9] ethnic_sgAsian/PI:rac_gay ethnic_sgLatino:rac_gay ttf # term factors pEthnic ethnic_sg rac_gay ethnic_sg:rac_gay pEthnic 1 0 0 0 ethnic_sg 0 1 0 1 rac_gay 0 0 1 1 mmi #model matrix assign [1] 0 1 1 1 1 2 2 3 4 4 reverse.map(mm, form, df) [1] 1 1 1 1 2 2 3 2 2 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Duplicated function with conditional statement
On some slightly different datasets: tt1-structure(list(subj = c(1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7, 8, 8, 8), response = structure(c(2L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L), .Label = c(buy, sample), class = factor), product = c(1, 2, 3, 2, 2, 3, 2, 1, 1, 4, 4, 2, 2, 4, 5, 5, 4, 3, 4, 5, 4, 2)), .Names = c(subj, response, product ), class = data.frame, row.names = c(NA, 22L)) tt2- structure(list(subj = c(1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7, 8, 8, 8), response = structure(c(2L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c(buy, sample), class = factor), product = c(1, 2, 3, 2, 2, 3, 2, 1, 1, 4, 1, 4, 5, 1, 4, 2, 3, 3, 2, 5, 3, 4)), .Names = c(subj, response, product ), class = data.frame, row.names = c(NA, 22L)) tt3- structure(list(subj = c(1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7, 8, 8, 8), response = structure(c(2L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 2L), .Label = c(buy, sample), class = factor), product = c(1, 2, 3, 2, 2, 3, 2, 1, 1, 4, 1, 1, 3, 5, 2, 2, 2, 2, 4, 3, 2, 5)), .Names = c(subj, response, product ), class = data.frame, row.names = c(NA, 22L)) #Tried David's solution: tt1$rown - rownames(tt1) as.numeric ( apply(tt1, 1, function(x) { x['product'] %in% tt1[ rownames(tt1) x['rown'] tt1$response == buy, product] } ) ) #gave inconsistent results especially since the first 10 rows were from `tt` # [1] 0 1 1 1 1 1 1 0 1 0 1 0 0 1 0 0 1 0 1 0 1 1 #similarly for `tt2` and `tt3`. ##Created this function. It seems to work in the tested cases, though it is not tested extensively. fun1- function(dat,colName,newColumn){ indx- which(dat[,colName]==buy) dat[,newColumn]-0 dat[unlist(lapply(seq_along(indx),function(i){ x1- if(i==length(indx)){ seq(indx[i],nrow(dat)) } else if((indx[i+1]-indx[i])==1){ indx[i] } else { seq(indx[i]+1,indx[i+1]-1) } x2- dat[unique(c(indx[i:1],x1)),] x3- subset(x2,response==sample) x4- subset(x2,response==buy) if(nrow(x3)!=0) { row.names(x3)[x3$product%in% x4$product] } })),newColumn]-1 dat } fun1(tt,response,newCol) # subj response product rown newCol #1 1 sample 1 1 0 #2 1 sample 2 2 0 #3 1 buy 3 3 0 #4 2 sample 2 4 0 #5 2 buy 2 5 0 #6 3 sample 3 6 1 #7 3 sample 2 7 1 #8 3 buy 1 8 0 #9 4 sample 1 9 1 #10 4 buy 4 10 0 fun1(tt1,response,newCol) # subj response product newCol #1 1 sample 1 0 #2 1 sample 2 0 #3 1 buy 3 0 #4 2 sample 2 0 #5 2 buy 2 0 #6 3 sample 3 1 #7 3 sample 2 1 #8 3 buy 1 0 #9 4 sample 1 1 #10 4 buy 4 0 #11 5 buy 4 0 #12 5 sample 2 1 #13 5 buy 2 0 #14 6 buy 4 0 #15 6 sample 5 0 #16 6 sample 5 0 #17 7 sample 4 1 #18 7 buy 3 0 #19 7 buy 4 0 #20 8 buy 5 0 #21 8 sample 4 1 #22 8 buy 2 0 #Also fun1(tt2,response,newCol) fun1(tt3,response,newCol) A.K. P.S. Below is OP's clarification regarding the conditional statement in a private message: I am sorry i didnt question it very clearly, let me change the conditional statement, I hope you can understand. i will explain by example as you can see, almost every number is duplicated, but only in row 6th,7th,and 9th the value on column is 1. on row4th, the value is duplicated( 2 already occurred on 2nd row),but since the value is considered as duplicated only if the value is duplicated where the response is 'buy' than the value on column, on row4th still zero. On row 6th, where the value product column is 3. 3 is already occurred in 3rd row where the value on response is 'buy', so the value on column should be 1 I hope it can understand the conditional statement. - Original Message - From: David Winsemius dwinsem...@comcast.net To: David Winsemius dwinsem...@comcast.net Cc: R-help@r-project.org; Uwe Ligges lig...@statistik.tu-dortmund.de Sent: Friday, July 26, 2013 5:16 PM Subject: Re: [R] Duplicated function with conditional statement On Jul 26, 2013, at 2:06 PM, David Winsemius wrote: On Jul 26, 2013, at 11:51 AM, Uwe Ligges wrote: On
Re: [R] Combine multiple random forests contained in a list
HI, Using the example in ?combine library(randomForest) rf1 - randomForest(Species ~ ., iris, ntree=50, norm.votes=FALSE) rf2 - randomForest(Species ~ ., iris, ntree=50, norm.votes=FALSE) rf3 - randomForest(Species ~ ., iris, ntree=50, norm.votes=FALSE) rf.all - combine(rf1, rf2, rf3) lst1- list(rf1,rf2,rf3) rf.allL- do.call(`combine`,lst1) #or rf.allL- Reduce(`combine,lst1) identical(rf.all,rf.allL) #[1] TRUE A.K. Is there a quick and easy way to pass randomForest objects contained in a list into the combine() function? As a result of calling randomForest through lapply(), I now have 10 randomForests in a list (rfors) I want to combine all 10 of them. Understandably combine(rfors) doesn't work as it doesn't recognise the individual forests within the list. I have spend quite sometime messing around with unlist(), lapply() and apply() to try and extract the information in a suitable format but to no avail. The only thing that works is combine(rfors[[1]], rfors[[2]] ...etc). This is a bit cumbersome though, not least because the number of random forests I'll need to combine is likely to change. Any sleek and elegant solution to this someone can suggest? Thanks in advance for any help. Anna __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] problem with ldpaths and new R
Hello! I have just installed R on an Ubutnu machine (13.04) and keep getting the following: erin@erin-Lenovo-IdeaPad-Y480:~$ R /usr/bin/R: line 236: /usr/lib/R/etc/ldpaths: No such file or directory R version 3.0.1 (2013-05-16) -- Good Sport Copyright (C) 2013 The R Foundation for Statistical Computing Platform: x86_64-pc-linux-gnu (64-bit) R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. Natural language support but running in an English locale R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. install.packages(pbdMPI,depen=TRUE) Installing package into /home/erin/lib/R/library (as lib is unspecified) --- Please select a CRAN mirror for use in this session --- also installing the dependency rlecuyer trying URL ' http://cran.revolutionanalytics.com/src/contrib/rlecuyer_0.3-3.tar.gz' Content type 'application/x-gzip' length 11756 bytes (11 Kb) opened URL == downloaded 11 Kb trying URL ' http://cran.revolutionanalytics.com/src/contrib/pbdMPI_0.1-8.tar.gz' Content type 'application/x-gzip' length 417547 bytes (407 Kb) opened URL == downloaded 407 Kb /usr/lib/R/bin/R: line 140: /usr/lib/R/etc/ldpaths: No such file or directory /usr/lib/R/bin/R: line 236: /usr/lib/R/etc/ldpaths: No such file or directory Error in file(con, r) : cannot open the connection Calls: Anonymous - sub - grep - readLines - file In addition: Warning message: In file(con, r) : cannot open file '/usr/lib/R/etc/Makeconf': No such file or directory /usr/lib/R/bin/R: line 140: /usr/lib/R/etc/ldpaths: No such file or directory /usr/lib/R/bin/R: line 236: /usr/lib/R/etc/ldpaths: No such file or directory Error in file(con, r) : cannot open the connection Calls: Anonymous - sub - grep - readLines - file In addition: Warning message: In file(con, r) : cannot open file '/usr/lib/R/etc/Makeconf': No such file or directory The downloaded source packages are in /tmp/RtmpAtAxNL/downloaded_packages Warning messages: 1: In install.packages(pbdMPI, depen = TRUE) : installation of package rlecuyer had non-zero exit status 2: In install.packages(pbdMPI, depen = TRUE) : installation of package pbdMPI had non-zero exit status I uninstalled it, deleted the /etc/R and /usr/lib/R directories, and re-installed. (Actually, I followed that process twice). I'm still stuck. Does anyone have any suggestions, please? Thanks, Erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] problem with ldpaths and new R
On Jul 26, 2013, at 9:45 PM, Erin Hodgess wrote: Hello! I have just installed R on an Ubutnu machine (13.04) Did you follow directions given here: http://cran.us.r-project.org/bin/linux/ubuntu/ Users who need to compile R packages from source [e.g. package maintainers, or anyone installing packages with install.packages()] should also install the r-base-dev package: sudo apt-get install r-base-dev I ask because one ot the other directions that you did not follow was: The best place to report problems with these packages or ask R questions specific to Ubuntu is the R-SIG-Debian mailing list. See https://stat.ethz.ch/mailman/listinfo/r-sig-debian -- David. and keep getting the following: erin@erin-Lenovo-IdeaPad-Y480:~$ R /usr/bin/R: line 236: /usr/lib/R/etc/ldpaths: No such file or directory R version 3.0.1 (2013-05-16) -- Good Sport Copyright (C) 2013 The R Foundation for Statistical Computing Platform: x86_64-pc-linux-gnu (64-bit) R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. Natural language support but running in an English locale R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. install.packages(pbdMPI,depen=TRUE) Installing package into Œ/home/erin/lib/R/library‚ (as Œlib‚ is unspecified) --- Please select a CRAN mirror for use in this session --- also installing the dependency Œrlecuyer‚ trying URL ' http://cran.revolutionanalytics.com/src/contrib/rlecuyer_0.3-3.tar.gz' Content type 'application/x-gzip' length 11756 bytes (11 Kb) opened URL == downloaded 11 Kb trying URL ' http://cran.revolutionanalytics.com/src/contrib/pbdMPI_0.1-8.tar.gz' Content type 'application/x-gzip' length 417547 bytes (407 Kb) opened URL == downloaded 407 Kb /usr/lib/R/bin/R: line 140: /usr/lib/R/etc/ldpaths: No such file or directory /usr/lib/R/bin/R: line 236: /usr/lib/R/etc/ldpaths: No such file or directory Error in file(con, r) : cannot open the connection Calls: Anonymous - sub - grep - readLines - file In addition: Warning message: In file(con, r) : cannot open file '/usr/lib/R/etc/Makeconf': No such file or directory /usr/lib/R/bin/R: line 140: /usr/lib/R/etc/ldpaths: No such file or directory /usr/lib/R/bin/R: line 236: /usr/lib/R/etc/ldpaths: No such file or directory Error in file(con, r) : cannot open the connection Calls: Anonymous - sub - grep - readLines - file In addition: Warning message: In file(con, r) : cannot open file '/usr/lib/R/etc/Makeconf': No such file or directory The downloaded source packages are in Œ/tmp/RtmpAtAxNL/downloaded_packages‚ Warning messages: 1: In install.packages(pbdMPI, depen = TRUE) : installation of package Œrlecuyer‚ had non-zero exit status 2: In install.packages(pbdMPI, depen = TRUE) : installation of package ŒpbdMPI‚ had non-zero exit status I uninstalled it, deleted the /etc/R and /usr/lib/R directories, and re-installed. (Actually, I followed that process twice). I'm still stuck. Does anyone have any suggestions, please? Thanks, Erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Combine multiple random forests contained in a list
I would say that the use of Reduce in this context is bad practice. from ?Reduce : Reduce uses a binary function to successively combine the elements of a given vector and a possibly given initial value. combine() is obviously not a binary function. do.call() seems to be THE appropriate idiom. -- Bert On Fri, Jul 26, 2013 at 9:26 PM, arun smartpink...@yahoo.com wrote: HI, Using the example in ?combine library(randomForest) rf1 - randomForest(Species ~ ., iris, ntree=50, norm.votes=FALSE) rf2 - randomForest(Species ~ ., iris, ntree=50, norm.votes=FALSE) rf3 - randomForest(Species ~ ., iris, ntree=50, norm.votes=FALSE) rf.all - combine(rf1, rf2, rf3) lst1- list(rf1,rf2,rf3) rf.allL- do.call(`combine`,lst1) #or rf.allL- Reduce(`combine,lst1) identical(rf.all,rf.allL) #[1] TRUE A.K. Is there a quick and easy way to pass randomForest objects contained in a list into the combine() function? As a result of calling randomForest through lapply(), I now have 10 randomForests in a list (rfors) I want to combine all 10 of them. Understandably combine(rfors) doesn't work as it doesn't recognise the individual forests within the list. I have spend quite sometime messing around with unlist(), lapply() and apply() to try and extract the information in a suitable format but to no avail. The only thing that works is combine(rfors[[1]], rfors[[2]] ...etc). This is a bit cumbersome though, not least because the number of random forests I'll need to combine is likely to change. Any sleek and elegant solution to this someone can suggest? Thanks in advance for any help. Anna __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.