date:20140414

Re: [R] a question about the output of plot

2014-04-14 Thread Jim Lemon


On 04/15/2014 11:46 AM, meng wrote:

Hi all:
I have a question about the output of plot.
I want to output 3 plots.
Method 1: by function histogram{lattice}
Method 2: by function hist{graphics}


But method 1 failed (the output is empty), and only method 2 works.
I can't find the reason. Many thanks for your help.


#Method1---failed(the output is empty)
library(lattice)
for(i in 1:3)
{
x<-rnorm(10)


jpeg(paste("e:\\hist_",i,".jpeg"))
histogram(x)
dev.off()
}




#Method 2---works
for(i in 1:3)
{
x<-rnorm(10)


jpeg(paste("e:\\hist_",i,".jpeg"))
hist(x)
dev.off()
}


Hi meng,
Try:

print(histogram(x))
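
For context, a minimal sketch of why the explicit print() matters: lattice functions such as histogram() return a "trellis" object instead of drawing directly, and inside a for loop nothing is auto-printed, so the device gets closed while still empty. The output directory (tempdir()) is an assumption standing in for the e:\\ path in the question, and paste0() is used because paste() would put spaces into the file names.

```r
library(lattice)

# tempdir() is a stand-in for the "e:\\" path used in the question.
out_dir <- tempdir()
files <- character(3)

for (i in 1:3) {
  x <- rnorm(10)
  files[i] <- file.path(out_dir, paste0("hist_", i, ".jpeg"))
  jpeg(files[i])
  print(histogram(x))  # explicit print() is what actually draws the trellis object
  dev.off()
}

file.exists(files)
```

hist() from graphics draws as a side effect of being called, which is why method 2 works without print().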

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] creating multiple line graphs

2014-04-14 Thread Jim Lemon


On 04/15/2014 12:27 PM, dila radi wrote:

Hi all,

I tried to draw multiple line graphs, and this is my data:

dt <- structure(list(X = structure(c(3L, 1L, 2L), .Label = c("10%",
"20%", "5%"), class = "factor"), NRM = c(0.993112, 0.9757191,
0.9709928), AAM = c(0.9928504, 0.9764055, 0.9702813), IDW = c(0.9923301,
0.9737133, 0.9640287), CCM = c(0.9929805, 0.9768217, 0.9708724
), MI = c(0.9931722, 0.9715817, 0.9649249)), .Names = c("X",
"NRM", "AAM", "IDW", "CCM", "MI"), row.names = c(NA, 3L), class =
"data.frame")


I'm using the following code:
y-axis is the value of the S-index (plotted over the range 0.99 - 1.0)
x-axis is the percentage missing (5%, 10% and 20%)

par(mar=c(4,4,2,1.2),oma=c(0,0,0,0))
plot(dt[,2], xaxt = "n",xlab="Percentage of Missing",ylab="S-index",
  main="Performance of S-Index for Different Percentage",
  ylim=c(0.99,1),type="l",col="blue",lwd=3)
lines(dt[,3],col="black",lwd=3,lty=2)
lines(dt[,4],col="red",lwd=3,type="l")
lines(dt[,5],col="green3",lwd=3,type="l")
lines(dt[,6],col="orange",lwd=3,lty=2)
axis(1,at=1:3,c("5%","10%", "20%"))
# legend entries follow the order and style of the lines drawn above
legend("topright", bty="n",c("NRM","AAM","IDW","CCM","MI"),
lwd=c(3,3,3,3,3), lty=c(1,2,1,1,2),col=
c("blue","black","red","green3","orange"))

I guess there is more sophisticated way to do it. Need your help. Thank you
so much.


Hi Dila,
Try this:

matplot(as.matrix(dt[,2:6]),type="b",
 col=c("blue","black","green3","red","orange"),
 pch=c("N","A","I","C","M"),lty=1:5,lwd=2)
legend("topright", bty="n",c("NRM","AAM","IDW","CCM","MI"),
 lty=1:5,col=c("blue","black","green3","red","orange"),
 pch=c("N","A","I","C","M"),lwd=2)
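
A self-contained version of the same idea, as a sketch: the data frame is rebuilt from the values in the question (rows in the order 5%, 10%, 20%), the plot goes to a temporary PDF only so that it runs anywhere, and the x-axis labels and axis titles are added on the assumption that they are wanted as in the original code.

```r
# Data from the question, rows in the order 5%, 10%, 20%.
dt <- data.frame(
  X   = c("5%", "10%", "20%"),
  NRM = c(0.993112, 0.9757191, 0.9709928),
  AAM = c(0.9928504, 0.9764055, 0.9702813),
  IDW = c(0.9923301, 0.9737133, 0.9640287),
  CCM = c(0.9929805, 0.9768217, 0.9708724),
  MI  = c(0.9931722, 0.9715817, 0.9649249),
  stringsAsFactors = FALSE
)

series <- names(dt)[-1]
cols <- c("blue", "black", "green3", "red", "orange")
marks <- substr(series, 1, 1)  # "N", "A", "I", "C", "M"

out <- file.path(tempdir(), "sindex.pdf")  # temporary file, an assumption
pdf(out)
# matplot() draws one line per column and picks ylim from the data
matplot(as.matrix(dt[, series]), type = "b", col = cols, pch = marks,
        lty = 1:5, lwd = 2, xaxt = "n",
        xlab = "Percentage of missing", ylab = "S-index")
axis(1, at = seq_len(nrow(dt)), labels = dt$X)
legend("topright", bty = "n", legend = series, col = cols,
       pch = marks, lty = 1:5, lwd = 2)
dev.off()
```

Because the colours, line types and plotting characters are defined once and reused in both calls, the legend cannot drift out of step with the lines.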

Jim


Re: [R] : Quantile and rowMean from multiple files in a folder

2014-04-14 Thread Zilefac Elvis

Hi AK,
All codes for simulation files work great.
I will try the code for observations and let you know.
Thanks very much.
Atem.







On Tuesday, April 15, 2014 12:01 AM, arun  wrote:
Yes,
my new solution ignores such cases.







On Monday, April 14, 2014 11:58 PM, Zilefac Elvis  
wrote:
Hi AK,
Please ignore any such site.
I will check it and include in the analysis.
Thanks,
Atem.



On Monday, April 14, 2014 9:34 PM, arun  wrote:



Hi,

I looked at your Observed.zip.  One of the files in it contains no data:
GG83_Sim.csv.ind.csv
The contents of the file are just:

Year
Year
trend
p    <


A.K.


On Monday, April 14, 2014 10:41 PM, Zilefac Elvis  
wrote:
Hi AK,
Q1) Please try to correct the error using the larger data set (Sample.zip). The 
issue is that once the code is written for and restricted to smaller data sets, I 
find it difficult to generalize it to larger data sets.

Q2) In the Quantilecode2.txt you just sent, you forgot to do the following 
section using the Observed.zip file. I ran the code up to section Q1 in 
Quantilecode2.txt using a larger data set and received the same error: "Error in 
2:nrow(lstNew) : argument of length 0". I have attached a larger data set too, 
so that you can generalize the code to suit it. Please do not 
forget to include the code below in the final code of Q2.


Once you fix these two, I should be able to fix the rest following these 
examples.

Thanks AK. Sorry for overloading you with much work.
Atem.

#==
dir.create("Indices")
names1 <- lapply(ReadOut1, function(x) names(x))[[1]]
lstNew <- simplify2array(ReadOut1)
lapply(2:nrow(lstNew), function(i) {
    dat1 <- data.frame(lstNew[1], do.call(cbind, lstNew[i, ]), stringsAsFactors = FALSE)
    colnames(dat1) <- c(rownames(lstNew)[1], paste(names(lst1),
        rep(rownames(lstNew)[i], length(lst1)), sep = "_"))
    write.csv(dat1, paste0(paste(getwd(), "Indices", rownames(lstNew)[i], sep = "/"),
        ".csv"), row.names = FALSE, quote = FALSE)
})
## Output2:
ReadOut2 <- lapply(list.files(recursive = TRUE)[grep("Indices",
    list.files(recursive = TRUE))],
    function(x) read.csv(x, header = TRUE, stringsAsFactors = FALSE))
length(ReadOut2)
# [1] 257
head(ReadOut2[[1]], 2)

#==




On Monday, April 14, 2014 8:07 PM, arun  wrote:

HI,

Please send your emails in plain text.  If you had looked at the dimensions of 
`lst2`:
sapply(lst2,function(x) sapply(x,ncol))[1:6,]
     G100 G101 G102 G103 G104 G105 G106 G107 G108 G109 G110 G111 G112 G113 G114
[1,]  258  258  258  258  258  257  258  258  258  258  258  258  258  258  247
[2,]  258  258  258  258  258  258  258  258  258  258  258  258  258  258  258
[3,]  258  258  258  258  258  258  258  258  258  258  258  258  258  258  257
[4,]  258  258  258  258  258  257  258  258  258  258  258  258  258  258  258
[5,]  258  258  258  258  258  258  258  258  258  258  258  258  258  258  258
[6,]  258  258  258  258  258  258  258  258  258  258  258  258  258  258  258
     G115 G116 G117 G118 G119 G120 GG10 GG11 GG12 GG13 GG14 GG15 GG16 GG17 GG18
[1,]  258  247  256  256  258  258  258  258  258  258  258  258  258  257  258
[2,]  258  250  257  258  258  256  258  258  258  258  258  258  258  258  258
[3,]  258  247  256  258  258  256  258  258  258  258  258  258  258  258  256
[4,]  258  258  258  257  258  258  258  258  258  258  258  258  258  257  258
[5,]  258  257  258  258  258  256  258  258  258  258  258  258  258  258  258
[6,]  258  257  249  257  258  258  258  258  258  258  258  258  258  258  258
     GG19 GG20 GG21 GG22 GG23 GG24 GG25 GG26 GG27 GG28
[1,]  258  258  258  258  258  258  258  258  258  258
[2,]  258  258  258  258  258  258  258  258  258  258
[3,]  258  258  257  258  256  257  258  258  258  258
[4,]  258  257  258  258  258  257  258  258  258  258
[5,]  258  258  257  258  257  258  258  258  258  258
[6,]  258  258  258  258  257  258  258  258  258  258 


# The dimensions are not consistent across the Simulations within each Site.
# My code assumed that all the datasets had the same number of columns, rows, etc.
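
A small sketch of how such an inconsistency could lead to "argument of length 0" (made-up data frames, not the actual files): simplify2array() only builds an array when every list element has the same length. Otherwise it returns the list unchanged, nrow() of that list is NULL, and 2:nrow(...) fails.

```r
# Two toy lists of data frames: one with matching columns, one without.
same  <- list(data.frame(a = 1:2, b = 3:4), data.frame(a = 5:6, b = 7:8))
mixed <- list(data.frame(a = 1:2, b = 3:4), data.frame(a = 5:6))

arr_same  <- simplify2array(same)   # 2 x 2 list-matrix, rows named "a" and "b"
arr_mixed <- simplify2array(mixed)  # lengths differ, so the list comes back as-is

dim(arr_same)
nrow(arr_mixed)                     # NULL, because a plain list has no dim

# 2:NULL is what raises "argument of length 0"
res <- try(2:nrow(arr_mixed), silent = TRUE)
inherits(res, "try-error")
```

Checking `is.null(dim(lstNew))` before the lapply() call would be one way to catch the problem early.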






On Monday, April 14, 2014 6:26 PM, Zilefac Elvis  wrote:

Hi AK,
I have another request for help.
Attached is a larger file (~27MB) for sample.zip. All files are same as 
previous except that I am using more sites to do the same thing that you did 
with sample.zip.

When generalizing Quantilecode.R to many sites, I receive an error when I run:

dir.create("Indices")
names1 <- lapply(ReadOut1, function(x) names(x))[[1]]
lstNew <- simplify2array(ReadOut1)

lapply(2:nrow(lstNew), function(i) {
  dat1 <- data.frame(lstNew[1], do.call(cbind, lstNew[i, ]), stringsAsFactors = 
FALSE)
  colnames(dat1) <- c(rownames(lstNew)[1], paste(names(lst1), 
rep(rownames(lstNew)[i],

[R] a question about the output of plot

2014-04-14 Thread meng

Hi all:
I have a question about the output of plot.
I want to output 3 plots.
Method 1: by function histogram{lattice}
Method 2: by function hist{graphics}


But method 1 failed (the output is empty), and only method 2 works.
I can't find the reason. Many thanks for your help.


#Method1---failed(the output is empty)
library(lattice)
for(i in 1:3)
{
x<-rnorm(10)


jpeg(paste("e:\\hist_",i,".jpeg"))
histogram(x)
dev.off()
}




#Method 2---works
for(i in 1:3)
{
x<-rnorm(10)


jpeg(paste("e:\\hist_",i,".jpeg"))
hist(x)
dev.off()
}




Many thanks.
My best

[R] error when installing package after installing R-3.1.0 on windows

2014-04-14 Thread Linda Peng

Hi,
 
I encountered the following error: "Warning in install.packages("xlsx") :   'lib = 
"C:/Program Files/R/R-3.1.0/library"' is not writable". This happened right after I 
installed R-3.1.0 on Windows alongside the previous R-2.15.2, which is still installed.
 
I checked the folder permissions and the folder is not set to read-only. Has anybody 
experienced the same, or does anyone have suggestions on what to check or fix?
 
Thanks for the help,
 
Linda
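
One common workaround, sketched here as an assumption and not a confirmed fix for this machine: on Windows the library under C:/Program Files is often not writable without administrator rights, so packages can go into a per-user library instead. The helper name use_writable_lib() is hypothetical; R_LIBS_USER, .libPaths() and file.access() are standard R. Note that file.access() can be unreliable on some Windows file systems, so the check is only a hint.

```r
# Hypothetical helper: return a library directory the current user can write to.
use_writable_lib <- function(lib = Sys.getenv("R_LIBS_USER")) {
  default_lib <- .libPaths()[1]
  # file.access(..., mode = 2) returns 0 when the path is writable
  if (file.access(default_lib, mode = 2) == 0) {
    return(default_lib)
  }
  dir.create(lib, recursive = TRUE, showWarnings = FALSE)
  .libPaths(c(lib, .libPaths()))  # put the personal library first
  lib
}

lib <- use_writable_lib()
# install.packages("xlsx", lib = lib)  # not run here: needs a network connection
```

Running R once as administrator is the other usual route, but a personal library avoids changing permissions on Program Files.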

Re: [R] Quantile and rowMean from multiple files in a folder

2014-04-14 Thread arun

Hi Atem,

I guess this is what you wanted.

### Q1:
### working directory: Observed
# Only one file per Site.  Assuming this is the case for the full dataset, then
# I guess there is no need to average

dir.create("final")
lst1 <- split(list.files(pattern = ".csv"), gsub("\\_.*", "",
    list.files(pattern = ".csv")))

lst2 <- lapply(lst1, function(x1) lapply(x1, function(x2) {
    lines1 <- readLines(x2)
    header1 <- lines1[1:2]
    dat1 <- read.table(text = lines1, header = FALSE, sep = ",",
        stringsAsFactors = FALSE, skip = 2)
    colnames(dat1) <- Reduce(paste, strsplit(header1, ","))
    dat1[-c(nrow(dat1), nrow(dat1) - 1), ]
}))


# different number of rows
sapply(seq_along(lst2), function(i) {
    lstN <- lapply(lst2[[i]], function(x) x[, -1])
    sapply(lstN, function(x) nrow(x))
})
# [1] 9 9 9 8 2 9

# difference in number of columns
sapply(seq_along(lst2), function(i) {
    sapply(lst2[[i]], function(x) ncol(x))
})
# [1] 157 258 258  98 157 258

library(plyr)
library(stringr)

lst3 <- setNames(lapply(seq_along(lst2), function(i) {
    lapply(lst2[[i]], function(x) {
        names(x)[-1] <- paste(names(x)[-1], names(lst1)[i], sep = "_")
        names(x) <- str_trim(names(x))
        x
    })[[1]]
}), names(lst1))

df1 <- join_all(lst3, by = "Year")
dim(df1)
# [1]    9 1181


sapply(split(names(df1)[-1], gsub(".*\\_", "", names(df1)[-1])), function(x) {
    df2 <- df1[, x]
    df3 <- data.frame(Percentiles = paste0(seq(0, 100, by = 1), "%"),
        numcolwise(function(y) quantile(y, seq(0, 1, by = 0.01), na.rm = TRUE))(df2),
        stringsAsFactors = FALSE)
    ncol(df3)
})
# G100 G101 G102 G103 G104 G105
#  157  258  258   98  157  258

lst4 <- split(names(df1)[-1], gsub(".*\\_", "", names(df1)[-1]))

lapply(seq_along(lst4), function(i) {
    df2 <- df1[, lst4[[i]]]
    df3 <- data.frame(Percentiles = paste0(seq(0, 100, by = 1), "%"),
        numcolwise(function(y) quantile(y, seq(0, 1, by = 0.01), na.rm = TRUE))(df2),
        stringsAsFactors = FALSE)
    df3[1:3, 1:3]
    write.csv(df3, paste0(paste(getwd(), "final", paste(names(lst1)[[i]],
        "Quantile", sep = "_"), sep = "/"), ".csv"), row.names = FALSE, quote = FALSE)
})
 

ReadOut1 <- lapply(list.files(recursive = TRUE)[grep("Quantile",
    list.files(recursive = TRUE))],
    function(x) read.csv(x, header = TRUE, stringsAsFactors = FALSE))

sapply(ReadOut1, dim)
#      [,1] [,2] [,3] [,4] [,5] [,6]
# [1,]  101  101  101  101  101  101
# [2,]  157  258  258   98  157  258

lapply(ReadOut1, function(x) x[1:2, 1:3])[1:3]
# [[1]]
#   Percentiles pav.DJF_G100 pav.MAM_G100
# 1          0%            0     0.640500
# 2          1%            0     0.664604
#
# [[2]]
#   Percentiles txav.DJF_G101 txav.MAM_G101
# 1          0%      -13.8756      4.742400
# 2          1%      -13.8140      4.817184
#
# [[3]]
#   Percentiles txav.DJF_G102 txav.MAM_G102
# 1          0%     -15.05000      4.520700
# 2          1%     -14.96833      4.543828

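To see where the 101 rows come from, here is a toy version of the percentile step. It is a sketch with made-up values, and it swaps plyr's numcolwise() for base R's sapply(), which does the same job when every column is numeric.

```r
# Toy data standing in for the columns of one site (values are made up).
df2 <- data.frame(pav.DJF = c(0.0, 1.2, 0.4, NA),
                  pav.MAM = c(0.64, 0.71, 0.90, 0.66))

probs <- seq(0, 1, by = 0.01)  # 0%, 1%, ..., 100%: 101 percentiles

# One column of percentiles per input column; NA values are dropped.
q <- sapply(df2, quantile, probs = probs, na.rm = TRUE)

df3 <- data.frame(Percentiles = paste0(seq(0, 100, by = 1), "%"), q,
                  stringsAsFactors = FALSE)
dim(df3)
```

The result has one row per percentile and one column per variable plus the Percentiles label, which matches the 101-row files read back into ReadOut1.
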
### Q2:
### Observed data

dir.create("Indices")
names1 <- unlist(lapply(ReadOut1, function(x) names(x)[-1]))
names2 <- gsub("\\_.*", "", names1)
names3 <- unique(gsub("[.]", " ", names2))

res <- do.call(rbind, lapply(seq_along(lst4), function(i) {
    df2 <- df1[, lst4[[i]]]
    vec1 <- colMeans(df2, na.rm = TRUE)
    vec2 <- rep(NA, length(names3))
    names(vec2) <- paste(names3, names(lst4)[[i]], sep = "_")
    vec2[names(vec2) %in% names(vec1)] <- vec1
    names(vec2) <- gsub("\\_.*", "", names(vec2))
    vec2
}))


lapply(seq_len(ncol(res)), function(i) {
    mat1 <- t(res[, i, drop = FALSE])
    colnames(mat1) <- names(lst4)
    write.csv(mat1, paste0(paste(getwd(), "Indices", gsub(" ", "_",
        rownames(mat1)), sep = "/"), ".csv"), row.names = FALSE, quote = FALSE)
})

## Output2:
ReadOut2 <- lapply(list.files(recursive = TRUE)[grep("Indices",
    list.files(recursive = TRUE))],
    function(x) read.csv(x, header = TRUE, stringsAsFactors = FALSE))

length(ReadOut2)
# [1] 257

list.files(recursive = TRUE)[grep("Indices", list.files(recursive = TRUE))][1]
# [1] "Indices/pav_ANN.csv"

res[, "pav ANN", drop = FALSE]
#       pav ANN
# [1,] 1.298811
# [2,] 7.642922
# [3,] 6.740011
# [4,]       NA
# [5,] 1.296650
# [6,] 6.887622

ReadOut2[[1]]
#       G100     G101     G102 G103    G104     G105
# 1 1.298811 7.642922 6.740011   NA 1.29665 6.887622

### Sample data
### Working directory changed to "sample"

dir.create("Indices_colMeans")

lst1 <- split(list.files(pattern = ".csv"), gsub("\\_.*", "",
    list.files(pattern = ".csv")))

lst2 <- lapply(lst1, function(x1) lapply(x1, function(x2) {
    lines1 <- readLines(x2)
    header1 <- lines1[1:2]
    dat1 <- read.table(text = lines1, header = FALSE, sep = ",",
        stringsAsFactors = FALSE, skip = 2)
    colnames(dat1) <- Reduce(paste, strsplit(header1, ","))
    dat1[-c(nrow(dat1), nrow(dat1) - 1), ]
}))
res1 <- do.call(rbind, lapply(seq_along(lst2), function(i) {
    perSim <- lapply(lst2[[i]], function(x) colMeans(x[, -1], na.rm = TRUE))
    rowMeans(do.call(cbind, perSim), na.rm = TRUE)
}))
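
The averaging in res1 can be followed on a toy case. This is only a sketch with two made-up "simulations" for one site: colMeans() averages each variable over the years within a simulation, and rowMeans() then averages those per-simulation means.

```r
# Two made-up simulations for one site; the first column is Year.
sims <- list(
  data.frame(Year = 2001:2003, pav = c(1, 2, 3), txav = c(10, 20, 30)),
  data.frame(Year = 2001:2003, pav = c(3, 4, 5), txav = c(30, 40, 50))
)

# One column per simulation, one row per variable.
per_sim <- do.call(cbind, lapply(sims, function(x) colMeans(x[, -1], na.rm = TRUE)))

# Average over simulations, giving one value per variable for this site.
site_mean <- rowMeans(per_sim, na.rm = TRUE)
site_mean
```

do.call(cbind, ...) lines the simulations up by position, so it relies on every simulation having the same variables in the same order.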

lapply(seq_len(ncol(res1)), function(i) {
    mat1 <- t(res1[, i, drop = FALSE])
    colnames(mat1) <- names(lst2)
    write.csv(mat1, paste0(paste(getwd(), "Indices_colMeans", gsub(" ", "_",
        rownames(mat1)), sep = "/"), ".csv"), row.names = FALSE, quote = FALSE)
})

##Output2 Sample
ReadOut2S <- 
lapply(list.files(recursive=TRUE)[grep("Indices",list.files(recursive=TRUE))],funct

Re: [R] Quantile and rowMean from multiple files in a folder

2014-04-14 Thread zilefacel...@yahoo.com


Hi AK,

Thanks very much.

I did send you another email with a larger Sample.zip file. The
Quantilecode.R which you initially developed for a smaller sample.zip did
not complete the task when I used it for a larger data set. Please check and
rectify the error.


Thanks,

Atem.
   -- Original Message --

 From : arun
 To : R. Help;
 Cc : Zilefac Elvis;
 Sent : 14-04-2014 18:57
 Subject : Re: Quantile and rowMean from multiple files in a folder


[R] creating multiple line graphs

2014-04-14 Thread dila radi

Hi all,

I tried to draw multiple line graphs, and this is my data:

dt <- structure(list(X = structure(c(3L, 1L, 2L), .Label = c("10%",
"20%", "5%"), class = "factor"), NRM = c(0.993112, 0.9757191,
0.9709928), AAM = c(0.9928504, 0.9764055, 0.9702813), IDW = c(0.9923301,
0.9737133, 0.9640287), CCM = c(0.9929805, 0.9768217, 0.9708724
), MI = c(0.9931722, 0.9715817, 0.9649249)), .Names = c("X",
"NRM", "AAM", "IDW", "CCM", "MI"), row.names = c(NA, 3L), class =
"data.frame")


I'm using the following code:
y-axis is the value of the S-index (plotted over the range 0.99 - 1.0)
x-axis is the percentage missing (5%, 10% and 20%)

par(mar=c(4,4,2,1.2),oma=c(0,0,0,0))
plot(dt[,2], xaxt = "n",xlab="Percentage of Missing",ylab="S-index",
 main="Performance of S-Index for Different Percentage",
 ylim=c(0.99,1),type="l",col="blue",lwd=3)
lines(dt[,3],col="black",lwd=3,lty=2)
lines(dt[,4],col="red",lwd=3,type="l")
lines(dt[,5],col="green3",lwd=3,type="l")
lines(dt[,6],col="orange",lwd=3,lty=2)
axis(1,at=1:3,c("5%","10%", "20%"))
# legend entries follow the order and style of the lines drawn above
legend("topright", bty="n",c("NRM","AAM","IDW","CCM","MI"),
lwd=c(3,3,3,3,3), lty=c(1,2,1,1,2),col=
c("blue","black","red","green3","orange"))

I guess there is more sophisticated way to do it. Need your help. Thank you
so much.

Regards,
Dila


Re: [R] : Quantile and rowMean from multiple files in a folder

2014-04-14 Thread arun





Hi,
Q1 solution already sent.

Regarding Q2, one of the files in the new Observed folder doesn't have any  
data (just the Year column alone).

That may be the reason for the problem.

### Q1: working directory: Observed
# Only one file per Site.  Assuming this is the case for the full dataset, then
# I guess there is no need to average
dir.create("final")
lst1 <- split(list.files(pattern = ".csv"), gsub("\\_.*", "", 
list.files(pattern = ".csv")))

lst2 <- lapply(lst1, function(x1) lapply(x1, function(x2) {
lines1 <- readLines(x2)
header1 <- lines1[1:2]
dat1 <- read.table(text = lines1, header = FALSE, sep = ",", 
stringsAsFactors = FALSE, 
skip = 2)
colnames(dat1) <- Reduce(paste, strsplit(header1, ","))
dat1[-c(nrow(dat1), nrow(dat1) - 1), ]
}))

lst3 <- lst2[sapply(seq_along(lst2),function(i){lstN <- 
sapply(lst2[[i]],function(x) is.integer(ncol(x)))})]


#difference in column number
sapply(seq_along(lst3), function(i) {
sapply(lst3[[i]], function(x) ncol(x))
})
# 
#[1] 157 258 258  98 157 258 256 258 250 258 258 147 157 250 250 256 249 240
# [19] 181 188 256 146 117 258 153 256 255 246 255 256 258 257 145 258 258 255
# [37] 258 157 164 144 265 258 254 258 258 157 258 176 258 256 257 258 258 258
# [55] 248 258 156 258 157 157 258 258 258 258 258 148 258 258 258 258 257 258
# [73] 258 258 157 154 153 258 248 255 257 256 258 258 157 256 256 257 257 250
# [91] 257 139 155 256 256 257 257 256 258 258 257 258 258 258 258 157 157 157
#[109] 258 258 258 258 256 258 157 258 258 256 258

library(plyr)
library(stringr)

lst4 <- setNames(lapply(seq_along(lst3), function(i) {
lapply(lst3[[i]], function(x) {
names(x)[-1] <- paste(names(x)[-1], names(lst1)[i], sep = "_")
names(x) <- str_trim(names(x))
x
})[[1]]
}), names(lst3))

df1 <- join_all(lst4, by = "Year")
dim(df1)
# [1] 9 27311

sapply(split(names(df1)[-1], gsub(".*\\_", "", names(df1)[-1])), function(x) {
df2 <- df1[, x]
df3 <- data.frame(Percentiles = paste0(seq(0, 100, by = 1), "%"), 
numcolwise(function(y) quantile(y, 
seq(0, 1, by = 0.01), na.rm = TRUE))(df2), stringsAsFactors = FALSE)
ncol(df3)
})
# G100 G101 G102 G103 G104 G105 G106 G107 G108 G109 G110 G111 G112 G113 G114 G115
#  157  258  258   98  157  258  256  258  250  258  258  147  157  250  250  256
# G116 G117 G118 G119 G120 GG10 GG11 GG12 GG13 GG14 GG15 GG16 GG17 GG18 GG19 GG20
#  249  240  181  188  256  146  117  258  153  256  255  246  255  256  258  257
# GG21 GG22 GG23 GG24 GG25 GG26 GG27 GG28 GG29 GG30 GG31 GG32 GG33 GG34 GG35 GG36
#  145  258  258  255  258  157  164  144  265  258  254  258  258  157  258  176
# GG37 GG38 GG39 GG40 GG41 GG42 GG43 GG44 GG45 GG46 GG47 GG48 GG49 GG50 GG51 GG52
#  258  256  257  258  258  258  248  258  156  258  157  157  258  258  258  258
# GG53 GG54 GG55 GG56 GG57 GG58 GG59 GG60 GG61 GG62 GG63 GG64 GG65 GG66 GG67 GG68
#  258  148  258  258  258  258  257  258  258  258  157  154  153  258  248  255
# GG69 GG70 GG71 GG72 GG73 GG74 GG75 GG76 GG77 GG78 GG79 GG80 GG81 GG82 GG83 GG84
#  257  256  258  258  157  256  256  257  257  250  257  139  155  256  256  257
# GG85 GG86 GG87 GG88 GG89 GG90 GG91 GG92 GG93 GG94 GG95 GG96 GG97 GG98 GG99 GGG1
#  257  256  258  258  257  258  258  258  258  157  157  157  258  258  258  258
# GGG2 GGG3 GGG4 GGG5 GGG6 GGG7 GGG8
#  256  258  157  258  258  256  258



lst5 <- split(names(df1)[-1], gsub(".*\\_", "", names(df1)[-1]))

lapply(seq_along(lst5), function(i) {
df2 <- df1[, lst5[[i]]]
df3 <- data.frame(Percentiles = paste0(seq(0, 100, by = 1), "%"), 
numcolwise(function(y) quantile(y, 
seq(0, 1, by = 0.01), na.rm = TRUE))(df2), stringsAsFactors = FALSE)
df3[1:3, 1:3]
write.csv(df3, paste0(paste(getwd(), "final", paste(names(lst4)[[i]], 
"Quantile", 
sep = "_"), sep = "/"), ".csv"), row.names = FALSE, quote = FALSE)
})

ReadOut1 <- lapply(list.files(recursive = TRUE)[grep("Quantile", 
list.files(recursive = TRUE))], 
function(x) read.csv(x, header = TRUE, stringsAsFactors = FALSE))

sapply(ReadOut1, dim)[,1:3]
# [,1] [,2] [,3]
#[1,]  101  101  101
#[2,]  157  258  258


lapply(ReadOut1, function(x) x[1:2, 1:3])[1:3]
#[[1]]
#  Percentiles pav.DJF_G100 pav.MAM_G100
#1  0%0 0.640500
#2  1%0 0.664604
#
#[[2]]
#  Percentiles txav.DJF_G101 txav.MAM_G101
#1  0%  -13.8756  4.742400
#2  1%  -13.8140  4.817184
#
#[[3]]
#  Percentiles txav.DJF_G102 txav.MAM_G102
#1  0% -15.05000  4.520700
#2  1% -14.96833  4.543828


### Q2: Observed data

dir.create("Indices")

names1 <- unlist(lapply(ReadOut1, function(x) names(x)[-1]))
names2 <- gsub("\\_.*", "", names1)
names3 <- unique(gsub("[.]", " ", names2))

res <- do.call(rbind, lapply(seq_along(lst5), function(i) {
df2 <- df1[, lst5[[i]]]
vec1 <- colMeans(df2, na.rm = TRUE)
vec2 <- rep(NA, length(names3))
names

Re: [R] : Quantile and rowMean from multiple files in a folder

2014-04-14 Thread arun



Hi,
It is because of different dimensions of Simulation data  within each Site.
Try:
dir.create("final")
lst1 <- split(list.files(pattern = ".csv"), gsub("\\_.*", "", 
list.files(pattern = ".csv")))
sapply(lst1,length)
# G100 G101 G102 G103 G104 G105 G106 G107 G108 G109 G110 G111 G112 G113 G114 G115
#  100  100  100  100  100  100  100  100  100  100  100  100  100  100  100  100
# G116 G117 G118 G119 G120 GG10 GG11 GG12 GG13 GG14 GG15 GG16 GG17 GG18 GG19 GG20
#  100  100  100  100  100  100  100  100  100  100  100  100  100  100  100  100
# GG21 GG22 GG23 GG24 GG25 GG26 GG27 GG28
#  100  100  100  100  100  100  100  100

lst2 <- lapply(lst1, function(x1) lapply(x1, function(x2) {
    lines1 <- readLines(x2)
    header1 <- lines1[1:2]
    dat1 <- read.table(text = lines1, header = FALSE, sep = ",", 
stringsAsFactors = FALSE, 
        skip = 2)
    colnames(dat1) <- Reduce(paste, strsplit(header1, ","))
    dat1[-c(nrow(dat1), nrow(dat1) - 1), ]
}))

##dimensions differ within each Site
sapply(lst2,function(x) sapply(x,ncol))[1:6,5:8]
#     G104 G105 G106 G107
#[1,]  258  257  258  258
#[2,]  258  258  258  258
#[3,]  258  258  258  258
#[4,]  258  257  258  258
#[5,]  258  258  258  258
#[6,]  258  258  258  258

##number of rows are consistent
sapply(lst2,function(x) any(sapply(x,nrow)!=9))
# G100  G101  G102  G103  G104  G105  G106  G107  G108  G109  G110  G111  G112 
#FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE 
# G113  G114  G115  G116  G117  G118  G119  G120  GG10  GG11  GG12  GG13  GG14 
#FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE 
# GG15  GG16  GG17  GG18  GG19  GG20  GG21  GG22  GG23  GG24  GG25  GG26  GG27 
#FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE 
# GG28 
#FALSE 
names1 <- unique(unlist(lapply(lst2, function(x) unlist(lapply(x, function(y)
    names(y)[-1])))))
length(names1)
#[1] 257


# lstYear <- lapply(lst2,function(x) lapply(x, function(y)
# y[,1,drop=FALSE])[[1]])

library(plyr)

lapply(seq_along(lst2), function(i) {
    lstN <- lapply(lst2[[i]], function(x) {
        datN <- as.data.frame(matrix(NA, nrow = 9, ncol = length(names1),
            dimnames = list(NULL, names1)))
        datN[, names1] <- x[, -1]
        datN
    })
    lstQ1 <- lapply(lstN, function(x) numcolwise(function(y) quantile(y,
        seq(0, 1, by = 0.01), na.rm = TRUE))(x))
    arr1 <- array(unlist(lstQ1), dim = c(dim(lstQ1[[1]]), length(lstQ1)),
        dimnames = list(NULL, lapply(lstQ1, names)[[1]]))
    res <- rowMeans(arr1, dims = 2, na.rm = TRUE)
    colnames(res) <- gsub(" ", "_", colnames(res))
    res1 <- data.frame(Percentiles = paste0(seq(0, 100, by = 1), "%"), res,
        stringsAsFactors = FALSE)
    write.csv(res1, paste0(paste(getwd(), "final", paste(names(lst1)[[i]],
        "Quantile", sep = "_"), sep = "/"), ".csv"), row.names = FALSE, quote = FALSE)
})
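
The key step is rowMeans(arr1, dims = 2), which averages over the third dimension of the array, i.e. over the simulations. A toy sketch with two made-up 2 x 2 quantile tables shows the effect:

```r
# Two made-up quantile tables (rows = percentiles, columns = variables).
q1 <- matrix(c(1, 2, 3, 4), nrow = 2)
q2 <- matrix(c(3, NA, 5, 8), nrow = 2)

# Stack them along a third dimension, one slice per simulation.
arr <- array(c(q1, q2), dim = c(2, 2, 2),
             dimnames = list(NULL, c("a", "b"), NULL))

# dims = 2 keeps the first two dimensions and averages over the rest;
# na.rm = TRUE means a missing cell uses only the simulations that have it.
avg <- rowMeans(arr, dims = 2, na.rm = TRUE)
avg
```

Each cell of avg is the mean of the matching cells across simulations, which is why every data frame first has to be padded out to the same set of columns (names1).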



## output files
list.files(recursive = TRUE)[grep("Quantile", list.files(recursive = TRUE))]
#[1] "final/G100_Quantile.csv" "final/G101_Quantile.csv"
#[3] "final/G102_Quantile.csv" "final/G103_Quantile.csv"
#[5] "final/G104_Quantile.csv" "final/G105_Quantile.csv"
#[7] "final/G106_Quantile.csv" "final/G107_Quantile.csv"
#[9] "final/G108_Quantile.csv" "final/G109_Quantile.csv"
#[11] "final/G110_Quantile.csv" "final/G111_Quantile.csv"
#[13] "final/G112_Quantile.csv" "final/G113_Quantile.csv"
#[15] "final/G114_Quantile.csv" "final/G115_Quantile.csv"
#[17] "final/G116_Quantile.csv" "final/G117_Quantile.csv"
#[19] "final/G118_Quantile.csv" "final/G119_Quantile.csv"
#[21] "final/G120_Quantile.csv" "final/GG10_Quantile.csv"
#[23] "final/GG11_Quantile.csv" "final/GG12_Quantile.csv"
#[25] "final/GG13_Quantile.csv" "final/GG14_Quantile.csv"
#[27] "final/GG15_Quantile.csv" "final/GG16_Quantile.csv"
#[29] "final/GG17_Quantile.csv" "final/GG18_Quantile.csv"
#[31] "final/GG19_Quantile.csv" "final/GG20_Quantile.csv"
#[33] "final/GG21_Quantile.csv" "final/GG22_Quantile.csv"
#[35] "final/GG23_Quantile.csv" "final/GG24_Quantile.csv"
#[37] "final/GG25_Quantile.csv" "final/GG26_Quantile.csv"
#[39] "final/GG27_Quantile.csv" "final/GG28_Quantile.csv"


ReadOut1 <- lapply(list.files(recursive = TRUE)[grep("Quantile", 
list.files(recursive = TRUE))], 
    function(x) read.csv(x, header = TRUE, stringsAsFactors = FALSE))
sapply(ReadOut1,function(x) dim(x))
#     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
#[1,]  101  101  101  101  101  101  101  101  101   101   101   101   101   101
#[2,]  258  258  258  258  258  258  258  258  258   258   258   258   258   258
#     [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26]
#[1,]   101   101   101   101   101   101   101   101   101   101   101   101
#[2,]   258   258   258   258   258   258   258   258   258   258   258   258
#     [,27] [,28] [,29] [,30] [,31] [,32] [,33] [,34] [,35] [,36] [,37] [,38]
#[1,]   101   101   101   101   101   101   101   101   101   101   101   101
#[2,]   258   258   258   258   258   258   258   258   258   258   258   258
#     [,39] [,40]
#[1,]   101   101
#[2,]

Re: [R] system()

2014-04-14 Thread Daniel Nordlund

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of Rui Barradas
> Sent: Monday, April 14, 2014 12:09 PM
> To: Doran, Harold; r-help@r-project.org
> Subject: Re: [R] system()
> 
> Hello,
> 
> Try instead
> 
> command <- paste(aa, fnm)
> system(command)
> 
> And read the help page for ?paste
> 
> Hope this helps,
> 
> Rui Barradas
> 
> 
> Em 14-04-2014 20:02, Doran, Harold escreveu:
> > I need to send a system command to another program from within R but
> have a small hangup
> >
> > I'm trying to do something like this
> >
> > system("notepad myfile.txt")
> >
> > But, more generally this is happening to multiple files, so I loop over
> thousands of files. For purposes of an example, my code is something like
> this, which does not work
> >
> > aa <- 'notepad.exe'
> > fnm <- 'myfile.txt'
> > system("aa fnm")
> >
> > Any suggestions?
> > Harold
> >

Harold,

You haven't said what OS you are running under, but given that your example 
program was notepad.exe I am going to guess some flavor of MS Windows.  The 
suggestion to use paste() is necessary, but it will probably not be sufficient 
to solve your problem.  The commands suggested

> command <- paste(aa, fnm)
> system(command)

freeze R on my Win 7 Pro x64 box using either 64-bit R-3.0.3 or R-3.1.0.  You 
might try switching to shell() instead of system()

> command <- paste(aa, fnm)
> shell(command)

However, it all depends on what programs you are trying to run and what 
behavior you expect.
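
A minimal sketch of the point made in this thread: system("aa fnm") hands the literal text "aa fnm" to the operating system, because R does not substitute variable names inside a quoted string, so the command has to be assembled with paste() first. The file names below are hypothetical, and the calls that would actually launch a program sit inside if (FALSE) so nothing is started by accident. shell() exists only on Windows, and wait = FALSE is one setting that may keep R from blocking while a GUI program such as notepad stays open; whether that is the behaviour wanted depends on the program being run.

```r
aa <- "notepad.exe"
fnms <- c("myfile.txt", "my other file.txt")   # hypothetical file names

# shQuote() protects names that contain spaces; type = "cmd" uses Windows quoting
commands <- paste(aa, shQuote(fnms, type = "cmd"))
print(commands)

if (FALSE) {   # not run: would open one notepad window per file
  for (command in commands) {
    system(command, wait = FALSE)   # or, on Windows only: shell(command, wait = FALSE)
  }
}
```

Building the full vector of commands first makes it easy to inspect what will be sent to the OS before looping over thousands of files.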


Dan

Daniel Nordlund
Bothell, WA USA
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Extracting values from rows which meet a condition in R 3.0.2

2014-04-14 Thread Jim Lemon


On 04/15/2014 06:24 AM, umair durrani wrote:

Hi, I have a big data frame with millions of rows and more than 20 columns. Let 
me first describe what the data is to make question more clear. The original 
data frame consists of locations, velocities and accelerations of 2169 vehicles 
during a 15 minute period. Each vehicle has a unique Vehicle.ID, an ID of the 
time frame in which it was observed i.e. Frame.ID, the velocity of vehicle in 
that frame i.e. svel, the acceleration of vehicle in that frame i.e. sacc and 
the class of that vehicle, vehicle.class, i.e. 1= motorcycle, 2= car, 3 = 
truck. These variables were recorded after every 0.1 seconds i.e. each frame is 
0.1 seconds. Here are the first 6 rows:

> dput(head(df))
structure(list(Vehicle.ID = c(2L, 2L, 2L, 2L, 2L, 2L), Frame.ID = 133:138,
Vehicle.class = c(2L, 2L, 2L, 2L, 2L, 2L), Lane = c(2L, 2L, 2L, 2L, 2L, 2L),
svel = c(37.29, 37.11, 36.96, 36.83, 36.73, 36.64),
sacc = c(0.07, 0.11, 0.15, 0.19, 0.22, 0.25)),
.Names = c("Vehicle.ID", "Frame.ID", "Vehicle.class", "Lane", "svel", "sacc"),
row.names = 7750:7755, class = "data.frame")

There are some instances in vehicles' journey during the 15 minute recording 
period that they completely stop i.e. svel==0. This continues for some frames 
and then vehicles gain speed again. For the purpose of reproducibility I am 
creating an example data set as follows:

x <- data.frame(Vehicle.ID = c(rep(10,5), rep(20,5), rep(30,5), rep(40,5), rep(50,5)),
                vehicle.class = c(rep(2,10), rep(3,10), rep(1,5)),
                svel = rep(c(1,0,0,0,3), 5),
                sacc = rep(c(0.3,0.001,0.001,0.002,0.5), 5))

As described above some vehicles stop and have zero velocity for some time but 
later accelerate to get up to speed. I want to find the acceleration, sacc they 
apply after having zero velocity for some time (moving from standstill 
position). This means that I should be able to look at the FIRST row AFTER the 
last frame in which svel==0. In the example data this means that the car 
(vehicle.class==2) having a Vehicle.ID==10 had a velocity, svel equal to 1 as 
seen in the first row. Later, it stopped for 3 frames (3 consecutive rows) and 
then accelerated to velocity, svel, equal to 3. I want the acceleration sacc it 
applied in those 2 frames (rows 4 and 5 for vehicle 10, which come out to be 
0.002 and 0.500). This means that for example data, following should be the 
output by vehicle.class:

output <- data.frame(Vehicle.ID = c(10,10,20,20,30,30,40,40,50,50),
                     vehicle.class = c(2,2,2,2,3,3,3,3,1,1),
                     xf = rep(c('l','f'), 5),
                     sacc = rep(c(0.002,0.500), 5))

xf identifies the last row l in which svel==0 and f is the first one after 
that. I have tried using plyr and for loop to split by vehicle.class but am not 
sure how to extract the sacc. Please note that xf should be a part of output. 
It is not in given data. The original data frame df has 2169 vehicles, some 
stopped and some did not so not all vehicles had svel==0. The vehicles which 
did stop didn't do it at the same time. Also, the number of rows in which 
svel==0 is different vehicle to vehicle.
Thanks,
Umair Durrani
Master's candidate
Civil and Environmental Engineering
University of Windsor   
[[alternative HTML version deleted]]


Hi Umair,
This may be a bit slow, but I think it will do what you want:

initacc<-function(x) {
 # columns of xout: Vehicle.ID, vehicle.class, xf (0 = l, 1 = f), sacc
 xout<-matrix(rep(NA,4),nrow=1)
 for(drow in 2:dim(x)[1]) {
  # previous frame stopped, current frame moving
  if(x[drow-1,"svel"] == 0 && x[drow,"svel"] > 0) {
   if(!is.na(xout[1,1])) {
    xout<-rbind(xout,c(x[drow-1,"Vehicle.ID"],
     x[drow-1,"vehicle.class"],0,x[drow-1,"sacc"]))
   }
   else {
    # first match: fill the placeholder row
    xout[1,]<-c(x[drow-1,"Vehicle.ID"],
     x[drow-1,"vehicle.class"],0,x[drow-1,"sacc"])
   }
   xout<-rbind(xout,c(x[drow,"Vehicle.ID"],
    x[drow,"vehicle.class"],1,x[drow,"sacc"]))
  }
 }
 xout<-as.data.frame(xout)
 names(xout)<-
  c("Vehicle.ID","vehicle.class","xf","sacc")
 xout$xf<-ifelse(xout$xf,"f","l")
 return(xout)
}
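
Since the real data has millions of rows, a vectorised variant may be worth trying. The following is only a sketch under stated assumptions: it uses the example data from the question, assumes rows are already ordered by vehicle and frame, and additionally requires both rows of a pair to belong to the same vehicle. The names last_stop, idx, rows and out are made up for illustration.

```r
x <- data.frame(Vehicle.ID = rep(c(10, 20, 30, 40, 50), each = 5),
                vehicle.class = c(rep(2, 10), rep(3, 10), rep(1, 5)),
                svel = rep(c(1, 0, 0, 0, 3), 5),
                sacc = rep(c(0.3, 0.001, 0.001, 0.002, 0.5), 5))

n <- nrow(x)
# TRUE at row i when row i is stopped, row i + 1 is moving,
# and both rows belong to the same vehicle
last_stop <- c(x$svel[-n] == 0 & x$svel[-1] > 0 &
               x$Vehicle.ID[-n] == x$Vehicle.ID[-1], FALSE)
idx <- which(last_stop)

# interleave each "l" row with the "f" row that follows it
rows <- as.vector(rbind(idx, idx + 1))

out <- data.frame(Vehicle.ID = x$Vehicle.ID[rows],
                  vehicle.class = x$vehicle.class[rows],
                  xf = rep(c("l", "f"), length(idx)),
                  sacc = x$sacc[rows])
print(out)
```

The comparisons run once over whole columns instead of row by row, which is usually the main cost in a loop over a large data frame.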

Jim

Re: [R] gRain on R 3.1.0

2014-04-14 Thread Uwe Ligges




On 14.04.2014 07:54, Anna Eklöf wrote:

I am using the gRain package but I can't get it to work under R 3.1.0. It is no 
longer available in the CRAN.
Does anyone have suggestions for how to get a successful installation?



See:
http://cran.r-project.org/web/packages/gRain/index.html

All there...

Best,
Uwe Ligges



A






[R] Plot mlogit object

2014-04-14 Thread Tim Marcella

Hi,

I cannot figure out how or if I even can plot the results from a nested
multinomial logit model. I am using the mlogit package.

Does anyone have a lead on any tutorials? Both of the vignettes are lacking
plotting instructions.

Thanks, Tim

-- 
Tim Marcella


Re: [R] mclogit

2014-04-14 Thread Philippe Galipeau

Hi,
 
have you found a solution for your question 1?

On Thursday, 11 October 2012 19:03:16 UTC-4, Katelyn Weaver wrote:

> Hello, 
>
> I am new to R and am trying to complete a mixed conditional logistic 
> regression. There are two issues that I am currently having: 
>
> 1. I am not sure how to insert the random effects variable into the 
> equation. My current equation is 
> model<-mclogit(Presence~AllWet+AllAg+strata(Pair)) 
> where Presence is a binary value (present or absent), AllWet and AllAg 
> shows the proportion of the location polygons covered by each habitat 
> type, 
> and Pair is showing the paired used and random polygons. The random 
> effects 
> that I want to control for are Bird ID (same bird at multiple locations). 
> Does anyone know how to write the formula properly to include the random 
> effects? 
>
> 2. When I enter the formula I keep getting Error: could not find function 
> "mclogit" 
> When I was using the clogit function I had to add the "survival" package 
> to 
> perform the analysis. What package do I have to add for mclogit? 
>
> Any assistance on this subject would be greatly appreciated. 
>
> Thank you, 
> Katelyn 
>
> -- 
> Katelyn Weaver 
> M.Sc. Candidate 
> Long Point Waterfowl 
> Western University 
> Cell: 519-619-4472 
> Email: kwe...@uwo.ca  
> www.longpointwaterfowl.org 
>

[R] Extracting values from rows which meet a condition in R 3.0.2

2014-04-14 Thread umair durrani

Hi, I have a big data frame with millions of rows and more than 20 columns. Let 
me first describe what the data is to make question more clear. The original 
data frame consists of locations, velocities and accelerations of 2169 vehicles 
during a 15 minute period. Each vehicle has a unique Vehicle.ID, an ID of the 
time frame in which it was observed i.e. Frame.ID, the velocity of vehicle in 
that frame i.e. svel, the acceleration of vehicle in that frame i.e. sacc and 
the class of that vehicle, vehicle.class, i.e. 1= motorcycle, 2= car, 3 = 
truck. These variables were recorded after every 0.1 seconds i.e. each frame is 
0.1 seconds. Here are the first 6 rows:
> dput(head(df))
structure(list(Vehicle.ID = c(2L, 2L, 2L, 2L, 2L, 2L), Frame.ID = 133:138,
Vehicle.class = c(2L, 2L, 2L, 2L, 2L, 2L), Lane = c(2L, 2L, 2L, 2L, 2L, 2L),
svel = c(37.29, 37.11, 36.96, 36.83, 36.73, 36.64),
sacc = c(0.07, 0.11, 0.15, 0.19, 0.22, 0.25)),
.Names = c("Vehicle.ID", "Frame.ID", "Vehicle.class", "Lane", "svel", "sacc"),
row.names = 7750:7755, class = "data.frame")
There are some instances in vehicles' journey during the 15 minute recording 
period that they completely stop i.e. svel==0. This continues for some frames 
and then vehicles gain speed again. For the purpose of reproducibility I am 
creating an example data set as follows:

x <- data.frame(Vehicle.ID = c(rep(10,5), rep(20,5), rep(30,5), rep(40,5), rep(50,5)),
                vehicle.class = c(rep(2,10), rep(3,10), rep(1,5)),
                svel = rep(c(1,0,0,0,3), 5),
                sacc = rep(c(0.3,0.001,0.001,0.002,0.5), 5))

As described above some vehicles stop and have zero velocity for some time but 
later accelerate to get up to speed. I want to find the acceleration, sacc they 
apply after having zero velocity for some time (moving from standstill 
position). This means that I should be able to look at the FIRST row AFTER the 
last frame in which svel==0. In the example data this means that the car 
(vehicle.class==2) having a Vehicle.ID==10 had a velocity, svel equal to 1 as 
seen in the first row. Later, it stopped for 3 frames (3 consecutive rows) and 
then accelerated to velocity, svel, equal to 3. I want the acceleration sacc it 
applied in those 2 frames (rows 4 and 5 for vehicle 10, which come out to be 
0.002 and 0.500). This means that for example data, following should be the 
output by vehicle.class:

output <- data.frame(Vehicle.ID = c(10,10,20,20,30,30,40,40,50,50),
                     vehicle.class = c(2,2,2,2,3,3,3,3,1,1),
                     xf = rep(c('l','f'), 5),
                     sacc = rep(c(0.002,0.500), 5))

xf identifies the last row l in which svel==0 and f is the first one after 
that. I have tried using plyr and for loop to split by vehicle.class but am not 
sure how to extract the sacc. Please note that xf should be a part of output. 
It is not in given data. The original data frame df has 2169 vehicles, some 
stopped and some did not so not all vehicles had svel==0. The vehicles which 
did stop didn't do it at the same time. Also, the number of rows in which 
svel==0 is different vehicle to vehicle.
Thanks,
Umair Durrani
Master's candidate 
Civil and Environmental Engineering
University of Windsor 

Re: [R] sink() and UTF-8 on non-UTF-8 systems

2014-04-14 Thread Milan Bouchet-Valat

Suggestions, anyone?

On Friday, 11 April 2014 at 17:49 +0200, Milan Bouchet-Valat wrote:
> Hi!
> 
> In the series "dealing with encoding madness on hostile systems", I'm
> looking for help as regards capturing R UTF-8 output on a system where
> the locale is not using UTF-8, and where some characters cannot even be
> represented using the locale encoding. The case I have in mind is
> printing a character vector with Russian text to the R Commander output
> window on an English/French (CP1252) Windows system.
> 
> Here's a code snippet illustrating the problem:
> > "\U41F"
> [1] "П" # OK
> > con <- file(open="w+", encoding="UTF-8")
> > capture.output(cat("\U41F"), file=con)
> > readLines(con, encoding="UTF-8")
> [1] "" # Not OK
> 
> (same result without specifying 'encoding')
> 
> 
> Now I have read ?sink and it is quite explicit about how this works:
> > If file is a character string, the file will be opened using the
> > current encoding. If you want a different encoding (e.g. to represent
> > strings which have been stored in UTF-8), use a file connection — but
> > some ways to produce R output will already have converted such strings
> > to the current encoding. 
> 
> The last words seem to apply to the case above, i.e. somewhere in the
> process the UTF-8 string is converted to the locale encoding. Is there
> any solution to get the correct output?
> 
> 
> Thanks
> 
> 
> > sessionInfo()
> R Under development (unstable) (2014-04-10 r65396)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> 
> locale:
> [1] LC_COLLATE=French_France.1252  LC_CTYPE=French_France.1252   
> [3] LC_MONETARY=French_France.1252 LC_NUMERIC=C  
> [5] LC_TIME=French_France.1252
> 
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
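
One approach that is sometimes used to get UTF-8 bytes onto disk regardless of the locale is sketched below. It is not a fix for sink() or capture.output() themselves: it bypasses them and writes the string directly, and it assumes the text is available as a character vector rather than only as printed output. The temporary file is a stand-in for a real destination.

```r
x <- "\u041f"                       # Cyrillic capital letter Pe
tmp <- tempfile(fileext = ".txt")   # hypothetical output file

# "native.enc" asks the connection not to re-encode; useBytes = TRUE makes
# writeLines() emit the UTF-8 bytes of the string unchanged
con <- file(tmp, open = "w", encoding = "native.enc")
writeLines(enc2utf8(x), con, useBytes = TRUE)
close(con)

# Read back, declaring the bytes as UTF-8, and inspect them
back <- readLines(tmp, encoding = "UTF-8")
bytes <- charToRaw(back)
print(bytes)
```

Checking the raw bytes rather than the printed string avoids being misled by how the console itself renders non-representable characters.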
> 

Re: [R] Error using the package tm.plugin.webmining "object '.Source' not found"

2014-04-14 Thread Milan Bouchet-Valat

On Sunday, 13 April 2014 at 20:50 -0400, brian arb wrote:
> I recently had an issue while trying to use the package tm.plugin.webmining.
> 
> I was able to get a hack to work for me and I wanted to share the diff and
> bring this to someone's attention.
> Or what is the proper way to report a bug for third party code?
You should contact the maintainer of the package. Apparently this
package was using tm's internal function .Source(), which was renamed
into Source() in tm 0.5-10.

You can likely work around this by defining this function first:
.Source <- function(defaultreader, encoding, length, lodsupport, names,
                    position, vectorized, class = NULL) {
    if (vectorized && (length <= 0))
        stop("vectorized sources must have positive length")

    if (!is.null(names) && (length != length(names)))
        stop("incorrect number of element names")

    structure(list(DefaultReader = defaultreader, Encoding = encoding,
                   Length = length, LoDSupport = lodsupport, Names = names,
                   Position = position, Vectorized = vectorized),
              class = unique(c(class, "Source")))
}

If this doesn't work, you can install an older version of tm
(see http://cran.r-project.org/src/contrib/Archive/tm/).

Regards

> Cheers
> 
> # error I get when using the plugin. #
> Error in get(name, envir = asNamespace(pkg), inherits = FALSE) :
>   object '.Source' not found
> 
> # My Hack Diff #
> diff ~/Downloads/tm.plugin.webmining/R/source.R \
> > tm.plugin.webmining/R/source.R
> 35c35,36
> <  s <- tm:::.Source(NULL, encoding, length(content_parsed), FALSE, NULL,
> 0, vectorized, class = class)
> ---
> >  s <- tm:::Source(defaultReader=readPlain, encoding=encoding,
> length=length(content_parsed),
> >   names=NA_character_, position=0, vectorized=vectorized, class=class)
> <
> 569,570c570,572
> <  s <- tm:::.Source(NULL, encoding = "UTF-8", length(content), FALSE,
> NULL, 0, vectorized = FALSE, class = "WebXMLSource")
> <  s$Content <- content
> ---
> >  s <- tm:::Source(defaultReader=readPlain, encoding="UTF-8",
> length=length(content),
> >   names=NA_character_, position=0, vectorized=FALSE, class="WebXMLSource")
> 
> 
> Using R version 3.1.0 beta (2014-03-28 r65330) -- "Spring Dance"
> 
> # output from console ##
> 
> > library(quantmod)
> Loading required package: Defaults
> Loading required package: xts
> Loading required package: zoo
> 
> Attaching package: 'zoo'
> 
> The following objects are masked from 'package:base':
> 
> as.Date, as.Date.numeric
> 
> Loading required package: TTR
> Version 0.4-0 included new data defaults. See ?getSymbols.
> > library(rJava)
> > library(boilerpipeR)
> > library(namespace)
> > library(tm.plugin.webmining)
> Loading required package: tm
> Loading required package: RCurl
> Loading required package: bitops
> 
> Attaching package: 'RCurl'
> 
> The following object is masked from 'package:rJava':
> 
> clone
> 
> Loading required package: XML
> 
> Attaching package: 'tm.plugin.webmining'
> 
> The following object is masked from 'package:RCurl':
> 
> getURL
> 
> The following object is masked from 'package:base':
> 
> parse
> 
> >
> > corpus <- WebCorpus(GoogleFinanceSource("NASDAQ:MSFT"))
> Error in get(name, envir = asNamespace(pkg), inherits = FALSE) :
>   object '.Source' not found
> 

Re: [R] problem on package "survey" , function svyglm,

2014-04-14 Thread Milan Bouchet-Valat

On Monday, 14 April 2014 at 13:59 -0400, Hanze Zhang wrote:
> Hi,
> 
> I want to do logistic regression based on a complex sample design. I used
> package survey, but when I ran svyglm, an error message came out:
> Error in onestrat(x[index, , drop = FALSE], clusters[index],
> nPSU[index][1],  :
>   Stratum (16) has only one PSU at stage 1
> 
> 
> My code is below:
> 
> a.design<-svydesign(id = ~CASENUM ,strata = ~STRATUM ,data = a ,weights =
> ~SIZAGYWT )
> summary(logistic1 <- svyglm(ANYCONTR ~ CHAIN+OWN+HPPAT, family =
> binomial(link = "logit"), design=a.design))
> 
> 
> How to solve this issue? Thank you.
You need to manually merge the stratum that has only one PSU with another
stratum. See section 3.2.1 in http://books.google.fr/books?id=L96ludyhFBsC
(search the whole book for "single" to find it).
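
As a rough illustration of the manual merge, here is a base-R sketch on made-up data that reuses the column names from the question. Which stratum the lonely one is merged into (15 here) is purely an assumption for the example; in practice that choice should follow the survey's design documentation. The survey package also documents a survey.lonely.psu option (for example options(survey.lonely.psu = "adjust")), which is worth reading about before deciding between the two routes.

```r
# Hypothetical toy data using the column names from the post
a <- data.frame(CASENUM = 1:6,
                STRATUM = c(14, 14, 15, 15, 15, 16))

# Number of distinct PSUs (here: CASENUM values) in each stratum
psu_per_stratum <- tapply(a$CASENUM, a$STRATUM,
                          function(id) length(unique(id)))
lonely <- names(psu_per_stratum)[psu_per_stratum == 1]
print(lonely)

# Merge stratum 16 into stratum 15 (an assumption made for this example)
a$STRATUM_MERGED <- ifelse(a$STRATUM == 16, 15, a$STRATUM)
merged_counts <- tapply(a$CASENUM, a$STRATUM_MERGED,
                        function(id) length(unique(id)))
print(merged_counts)
```

The design would then be declared with strata = ~STRATUM_MERGED in place of ~STRATUM.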

Regards


Re: [R] system()

2014-04-14 Thread Rui Barradas


Hello,

Try instead

command <- paste(aa, fnm)
system(command)

And read the help page for ?paste

Hope this helps,

Rui Barradas


On 14-04-2014 20:02, Doran, Harold wrote:

I need to send a system command to another program from within R but have a 
small hangup

I'm trying to do something like this

system("notepad myfile.txt")

But, more generally this is happening to multiple files, so I loop over 
thousands of files. For purposes of an example, my code is something like this, 
which does not work

aa <- 'notepad.exe'
fnm <- 'myfile.txt'
system("aa fnm")

Any suggestions?
Harold


Re: [R] R 3.0.3, Windows 7: Problem installing XML package

2014-04-14 Thread Alpesh Pandya

Thank you for response Rui.

I still get the same error with this repository.

Installing package into 'C:/Users/APandya/Documents/R/win-library/3.0'
(as 'lib' is unspecified)
trying URL '
http://cran.dcc.fc.up.pt/bin/windows/contrib/3.0/XML_3.98-1.1.zip'
Content type 'application/zip' length 4288136 bytes (4.1 Mb)
opened URL
downloaded 4.1 Mb

Error in read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type"))
:
  cannot open the connection
In addition: Warning messages:
1: In download.file(url, destfile, method, mode = "wb", ...) :
  downloaded length 4276224 != reported length 4288136
2: In unzip(zipname, exdir = dest) : error 1 in extracting from zip file
3: In read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
  cannot open compressed file 'XML/DESCRIPTION', probable reason 'No such
file or directory'
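
The first warning is the informative one: fewer bytes arrived than the server announced, so the zip is truncated and the later unzip and DESCRIPTION errors follow from that. A small sketch of how the download could be checked before installing is given below. The helper download_is_complete is hypothetical, the sizes are copied from the log above, and the part that touches the network is wrapped in if (FALSE) so it is not run here.

```r
# Sizes copied from the warning message in the log above
reported_bytes <- 4288136
received_bytes <- 4276224
missing_bytes  <- reported_bytes - received_bytes
print(missing_bytes)

# Hypothetical helper: TRUE only when the file on disk has the expected size
download_is_complete <- function(path, expected_bytes) {
  file.exists(path) && file.info(path)$size == expected_bytes
}

if (FALSE) {   # not run: needs network access and a Windows build of R
  url <- "http://cran.dcc.fc.up.pt/bin/windows/contrib/3.0/XML_3.98-1.1.zip"
  zip_path <- file.path(tempdir(), "XML_3.98-1.1.zip")
  download.file(url, zip_path, mode = "wb")
  if (download_is_complete(zip_path, reported_bytes)) {
    install.packages(zip_path, repos = NULL)   # install from the local zip
  }
}
```

If the size check keeps failing across mirrors and networks, something between R and the server (proxy, antivirus, download method) is a more likely culprit than the package itself.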



On Mon, Apr 14, 2014 at 2:17 PM, Rui Barradas  wrote:

> Hello,
> I have package XML installed on Windows 7, R 3.0.3 and I had no problem at
> all. Can't you try (it worked with me)
>
> install.packages("XML", repos = "http://cran.dcc.fc.up.pt")
>
> Hope this helps,
>
> Rui Barradas
>
> On 14-04-2014 16:24, Alpesh Pandya wrote:
>
>  I have tried these sources (almost all US mirrors):
>>
>> http://cran.cnr.Berkeley.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip
>> http://cran.stat.ucla.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip
>> http://streaming.stat.iastate.edu/CRAN/bin/windows/contrib/
>> 3.0/XML_3.98-1.1.zip
>> http://ftp.ussg.iu.edu/CRAN/bin/windows/contrib/3.0/XML_3.98-1.1.zip
>> http://rweb.quant.ku.edu/cran/bin/windows/contrib/3.0/XML_3.98-1.1.zip
>> http://watson.nci.nih.gov/cran_mirror/bin/windows/
>> contrib/3.0/XML_3.98-1.1.zip
>> http://cran.mtu.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip
>> http://cran.wustl.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip
>> http://cran.case.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip
>> http://ftp.osuosl.org/pub/cran/bin/windows/contrib/3.0/XML_3.98-1.1.zip
>> http://lib.stat.cmu.edu/R/CRAN/bin/windows/contrib/3.0/XML_3.98-1.1.zip
>>
>> I have confirmed with IT that there is no restriction on downloading this
>> zip file from any of these sources. Also I am getting same error when I
>> try
>> from my home network as well.
>>
>>
>> On Sun, Apr 13, 2014 at 9:46 AM, Uwe Ligges <
>> lig...@statistik.tu-dortmund.de
>>
>>> wrote:
>>>
>>
>>
>>>
>>> On 13.04.2014 01:30, Alpesh Pandya wrote:
>>>
>>>  @Uwe I tried the same steps from office as well as home network with
 same
 results. Are you using windows 7 with R 3.0.3?

 I have seen same question being asked by others without any resolution.
 Is
 anything special about XML package? I am OK use older version of package
 but in archives there are no zip files (only gz files). Is windows
 platform
 not recommended for R?


>>> Right, and you can try to install these from sources.
>>> But I doubt you need it. You still have not told us if you tried another
>>> mirror to download the XML file from and what your local IT support tells
>>> you while your downloads are incomplete.
>>>
>>> Best,
>>> Uwe Ligges
>>>
>>>
>>>
>>>
>>>
>>>
>>>
 On Sat, Apr 12, 2014 at 7:22 PM, Uwe Ligges <
 lig...@statistik.tu-dortmund.de

  wrote:
>
>


> On 12.04.2014 22:39, Alpesh Pandya wrote:
>
>   Thank you for response Uwe. I tried multiple times by downloading the
>
>> zip
>> file from many sources but still the same error. This is a major road
>> block
>> for me in using R. Appreciate any help on this.
>>
>>
>>  Please ask your local IT staff.
>
> I get, using the same mirror:
>
>   options("repos"=c(CRAN="http://watson.nci.nih.gov/cran_mirror"))
>
>> install.packages("XML", lib="d:/temp")
>>
>>
> trying URL 'http://watson.nci.nih.gov/cran_mirror/bin/windows/
>
> contrib/3.0/XML_3.98-1.1.zip'
> Content type 'application/zip' length 4288136 bytes (4.1 Mb)
> opened URL
> downloaded 4.1 Mb
>
> package 'XML' successfully unpacked and MD5 sums checked
>
> The downloaded binary packages are in
>   d:\temp\RtmpqMqL8L\downloaded_packages
>
>
>
> Best,
> Uwe Ligges
>
>
>
>
>
>
>
>
>  On Fri, Apr 11, 2014 at 6:53 PM, Uwe Ligges <
>> lig...@statistik.tu-dortmund.de
>>
>>   wrote:
>>
>>>
>>>
>>> Works for me.
>>
>>
>>> Best,
>>> Uwe Ligges
>>>
>>>
>>>
>>>
>>> On 11.04.2014 17:10, Alpesh Pandya wrote:
>>>
>>>Using install.packages('XML') command produces this error:
>>>
>>>
 trying URL
 '
 http://watson.nci.nih.gov/cran_mirror/bin/windows/
 contrib/3.0/XML_3.98-1.1.zip
 '
 Content type 'application/zip' length 4288136 bytes (4.1 Mb)
 opened URL
 downloaded 4.1 Mb

 Error in read.dcf(file.path

[R] system()

2014-04-14 Thread Doran, Harold

I need to send a system command to another program from within R but have a 
small hangup

I'm trying to do something like this

system("notepad myfile.txt")

But, more generally this is happening to multiple files, so I loop over 
thousands of files. For purposes of an example, my code is something like this, 
which does not work

aa <- 'notepad.exe'
fnm <- 'myfile.txt'
system("aa fnm")

Any suggestions?
Harold

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R 3.0.3, Windows 7: Problem installing XML package

2014-04-14 Thread Rui Barradas


Hello,
I have package XML installed on Windows 7, R 3.0.3 and I had no problem 
at all. Can't you try (it worked with me)


install.packages("XML", repos = "http://cran.dcc.fc.up.pt")

Hope this helps,

Rui Barradas

On 14-04-2014 16:24, Alpesh Pandya wrote:

I have tried these sources (almost all US mirrors):

http://cran.cnr.Berkeley.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip
http://cran.stat.ucla.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip
http://streaming.stat.iastate.edu/CRAN/bin/windows/contrib/3.0/XML_3.98-1.1.zip
http://ftp.ussg.iu.edu/CRAN/bin/windows/contrib/3.0/XML_3.98-1.1.zip
http://rweb.quant.ku.edu/cran/bin/windows/contrib/3.0/XML_3.98-1.1.zip
http://watson.nci.nih.gov/cran_mirror/bin/windows/contrib/3.0/XML_3.98-1.1.zip
http://cran.mtu.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip
http://cran.wustl.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip
http://cran.case.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip
http://ftp.osuosl.org/pub/cran/bin/windows/contrib/3.0/XML_3.98-1.1.zip
http://lib.stat.cmu.edu/R/CRAN/bin/windows/contrib/3.0/XML_3.98-1.1.zip

I have confirmed with IT that there is no restriction on downloading this
zip file from any of these sources. Also I am getting same error when I try
from my home network as well.


On Sun, Apr 13, 2014 at 9:46 AM, Uwe Ligges 
wrote:





On 13.04.2014 01:30, Alpesh Pandya wrote:


@Uwe I tried the same steps from office as well as home network with same
results. Are you using windows 7 with R 3.0.3?

I have seen same question being asked by others without any resolution. Is
anything special about XML package? I am OK use older version of package
but in archives there are no zip files (only gz files). Is windows
platform
not recommended for R?



Right, and you can try to install these from sources.
But I doubt you need it. You still have not told us if you tried another
mirror to download the XML file from and what your local IT support tells
you while your downloads are incomplete.

Best,
Uwe Ligges








On Sat, Apr 12, 2014 at 7:22 PM, Uwe Ligges <
lig...@statistik.tu-dortmund.de


wrote:






On 12.04.2014 22:39, Alpesh Pandya wrote:

  Thank you for response Uwe. I tried multiple times by downloading the

zip
file from many sources but still the same error. This is a major road
block
for me in using R. Appreciate any help on this.



Please ask your local IT staff.

I get, using the same mirror:

  options("repos"=c(CRAN="http://watson.nci.nih.gov/cran_mirror"))

install.packages("XML", lib="d:/temp")



trying URL 'http://watson.nci.nih.gov/cran_mirror/bin/windows/

contrib/3.0/XML_3.98-1.1.zip'
Content type 'application/zip' length 4288136 bytes (4.1 Mb)
opened URL
downloaded 4.1 Mb

package 'XML' successfully unpacked and MD5 sums checked

The downloaded binary packages are in
  d:\temp\RtmpqMqL8L\downloaded_packages



Best,
Uwe Ligges









On Fri, Apr 11, 2014 at 6:53 PM, Uwe Ligges <
lig...@statistik.tu-dortmund.de

  wrote:




   Works for me.



Best,
Uwe Ligges




On 11.04.2014 17:10, Alpesh Pandya wrote:

   Using install.packages('XML') command produces this error:



trying URL
'
http://watson.nci.nih.gov/cran_mirror/bin/windows/
contrib/3.0/XML_3.98-1.1.zip
'
Content type 'application/zip' length 4288136 bytes (4.1 Mb)
opened URL
downloaded 4.1 Mb

Error in read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package",
"Type")) :
  cannot open the connection
In addition: Warning messages:
1: In download.file(url, destfile, method, mode = "wb", ...) :
  downloaded length 4276224 != reported length 4288136
2: In unzip(zipname, exdir = dest) : error 1 in extracting from zip
file
3: In read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package",
"Type"))
:
  cannot open compressed file 'XML/DESCRIPTION', probable reason
'No
such
file or directory'


Upon receiving this error, I downloaded XML_3.98-1.1.zip directly from
cran
site. But this zip file is not a valid archive (cannot open using
winzip).
Also trying to install using this downloaded file produces the
following
error:

Installing package into 'C:/Users/APandya/Documents/R/
win-library/3.0'
(as 'lib' is unspecified)
Warning in install.packages :
  error 1 in extracting from zip file
Warning in install.packages :
  cannot open compressed file 'XML/DESCRIPTION', probable reason
'No
such
file or directory'
Error in install.packages : cannot open the connection

I  downloaded this zip file from multiple sources and tried to install
with
same result.
















[R] problem on package "survey" , function svyglm,

2014-04-14 Thread Hanze Zhang

Hi,

I want to do logistic regression based on a complex sample design. I used
package survey, but when I ran svyglm, an error message came out:
Error in onestrat(x[index, , drop = FALSE], clusters[index],
nPSU[index][1],  :
  Stratum (16) has only one PSU at stage 1


My code is below:

a.design<-svydesign(id = ~CASENUM ,strata = ~STRATUM ,data = a ,weights =
~SIZAGYWT )
summary(logistic1 <- svyglm(ANYCONTR ~ CHAIN+OWN+HPPAT, family =
binomial(link = "logit"), design=a.design))


How to solve this issue? Thank you.


[R] X11 font -adobe-helvetica-%s-%s-*-*-%d-*-*-*-*-*-*-*, face 2 at size 11 could not be loaded

2014-04-14 Thread Florian Burkart

Hi,

I am on Ubuntu 13.10, with R version 3.1.0 beta (2014-03-28 r65330) --
"Spring Dance" (64 bit).

It is a new install and when trying to plot I am getting above error message

Error in title(main = "Test", line = -1) :
  X11 font -adobe-helvetica-%s-%s-*-*-%d-*-*-*-*-*-*-*, face 2 at size 11
could not be loaded

This is happening with X11(type="Xlib")

I looked around a bit, but could only find quite old threads. Prof. Ripley
replied to a similar issue in 2013 with

> See ?X11 and the 'R Installation and Administration Manual'. You are
> dragging up ancient history (2002). The 'modern' X11 device (from 2007)
> uses cairographics and does not use X11 fonts. I suggest you take a look at
> how R was built and ensure that the cairo-based device is available.
> Further, for a long time most X11 installations have been from Xorg and not
> Xfree86, and do not generally have a config file.

That unfortunately doesn't help me, because I am using events:

setGraphicsEventHandlers(prompt="Click and drag to zoom, hit q to quit",
   onMouseDown = dragmousedown,
   onMouseUp = mouseup,
   onKeybd = keydown)
eventEnv <- getGraphicsEventEnv()
getGraphicsEvent()

And those aren't supported on any other device. It is still working on my
other machine, so I presume I just need to find fonts somewhere.

How do I install or generate those fonts on Ubuntu?

Thanks


[R] hetglm() and robust standard errors

2014-04-14 Thread ChrisR

Hi everyone,
I am using the hetglm() command from the package 'glmx' (0.1-0). It seems
that hetglm() is incompatible with the robust standard errors estimator
provided in the 'AER' package: coeftest(mymodel,vcov=vcovHC)
Any suggestions how I could obtain robust standard errors for the
heteroscedastic probit?

Thanks,
Chris




Re: [R] mean calculations from a dframe column

2014-04-14 Thread andre.zacha...@gmail.com

Hello,

Yes, I will try colMeans. I thought it would not get rid of the NAs.

But thank you for the advice!

Andre
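
On the NA question: colMeans() takes the same na.rm argument as mean(), so it can drop missing values too. A minimal sketch with a made-up data frame (the object `a` and its values are illustrative only, not the poster's data):

```r
# Hypothetical numeric data frame with missing values.
a <- data.frame(x = c(1, 2, NA, 4), y = c(10, NA, 30, 40))

m_apply    <- apply(a, 2, mean, na.rm = TRUE)  # as in the earlier reply
m_colmeans <- colMeans(a, na.rm = TRUE)        # same column means, NAs dropped

print(m_colmeans)
print(all.equal(m_apply, m_colmeans))
```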

2014-04-14 13:25 GMT+01:00 David McPearson [via R] <ml-node+s789695n4688731...@n4.nabble.com>:

> On Sun, 13 Apr 2014 05:01:40 -0700 (PDT) "[hidden email]" wrote:
>
> > Thank you very much!!
> >
> ..
> ..
> > *From:* arun kirshna [via R]
> > *Sent:* 13 April 2014 11:23
> > *To:* [hidden email]
> > *Subject:* Re: mean calculations from a dframe column
> >
> > Hi André,
> >
> > Your code was missing some information. If your code looks like this:
> ..
> ..
> > Mean <- apply(a, 2, mean, na.rm = TRUE)
> ..
> ..
> ..
>
> Andre,
>
> Just for future reference, have a look at
> ?colMeans
> "These functions are equivalent to use of apply with FUN = mean or FUN =
> sum
> with appropriate margins, but are a lot faster."
>
> Cheers,
> D.
>





[R] I can't programe routine comp()

2014-04-14 Thread Endy

Dear R users,

I am trying to program the comp() routine in package survMisc.

I am reading the data below with

d=read.table( "C:\\. . .",fill=TRUE,header=TRUE)

Then I load the packages 'survival' and 'survMisc' with library(survival) and
library(survMisc), and I run the commands

s=survfit(Surv(d[,2], d[,3])~d[,1], data=d)
comp(s)

and I get the error

Error in get(t1, loc1) : object 'd[, 2]' not found

If instead I use the commands

s=survfit(Surv(T, Status)~Group, data=d)
comp(s)

routine comp() runs perfectly. However, when I am programming I can't see a way
to know the variable names in advance in order to use them.
Can anybody give me a suggestion?

Thanks in advance
Endy

NB. The data must be stacked in three (3) columns before being read.
They are repeated in nine (9) columns to save space.

GroupTStatusGroupTStatusGroupTStatus

120810155124141
116020111222041
11496011071210631
1146201110124811
1143301332121051
11377022569026411
11330022506023901
1996022409022881
1226022218024211
1119902185702791
1021829027481
1530021562024861
1118202147002481
11167021363022721
14181210300210741
138312860023811
127612125802101
110412224602531
160912187002801
117212179902351
1487121709022481
1662121674027041
1194121568022111
1230121527022191
1526121324026061
1122129570   
1129129320   
174128470   
1122128480   
1861218500   
14661218430   
11921215350   
11091214470   
1551213840
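
Since the named form of the call is reported above to work, one way to avoid needing the names in advance is to control them yourself. The sketch below assumes only the column order (group, time, status); the data frame and its V1..V3 names are made up for illustration. It shows two options: building the formula from whatever the names happen to be, or renaming the columns to fixed names. Whether comp() accepts a formula passed in through a variable has not been checked here, which is why the renaming route is listed as well.

```r
# Hypothetical data frame whose column names are not known in advance;
# only the column order (group, time, status) is assumed.
d <- data.frame(V1 = c(1, 1, 2, 2),
                V2 = c(5, 8, 3, 9),
                V3 = c(1, 0, 1, 1))

# Option 1: build the model formula from the existing names.
nm <- names(d)
f  <- as.formula(sprintf("Surv(%s, %s) ~ %s", nm[2], nm[3], nm[1]))
print(f)

# Option 2: rename the first three columns, then use the call that is
# reported to work in the post.
names(d)[1:3] <- c("Group", "T", "Status")
# library(survival); library(survMisc)
# s <- survfit(Surv(T, Status) ~ Group, data = d)
# comp(s)
print(names(d))
```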

Re: [R] R 3.0.3, Windows 7: Problem installing XML package

2014-04-14 Thread Alpesh Pandya

I have tried these sources (almost all US mirrors):

http://cran.cnr.Berkeley.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip
http://cran.stat.ucla.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip
http://streaming.stat.iastate.edu/CRAN/bin/windows/contrib/3.0/XML_3.98-1.1.zip
http://ftp.ussg.iu.edu/CRAN/bin/windows/contrib/3.0/XML_3.98-1.1.zip
http://rweb.quant.ku.edu/cran/bin/windows/contrib/3.0/XML_3.98-1.1.zip
http://watson.nci.nih.gov/cran_mirror/bin/windows/contrib/3.0/XML_3.98-1.1.zip
http://cran.mtu.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip
http://cran.wustl.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip
http://cran.case.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip
http://ftp.osuosl.org/pub/cran/bin/windows/contrib/3.0/XML_3.98-1.1.zip
http://lib.stat.cmu.edu/R/CRAN/bin/windows/contrib/3.0/XML_3.98-1.1.zip

I have confirmed with IT that there is no restriction on downloading this
zip file from any of these sources. I also get the same error when I try
from my home network.


On Sun, Apr 13, 2014 at 9:46 AM, Uwe Ligges wrote:

> On 13.04.2014 01:30, Alpesh Pandya wrote:
>
>> @Uwe I tried the same steps from the office as well as my home network,
>> with the same results. Are you using Windows 7 with R 3.0.3?
>>
>> I have seen the same question asked by others without any resolution. Is
>> there anything special about the XML package? I am OK with using an older
>> version of the package, but in the archives there are no zip files (only
>> gz files). Is the Windows platform not recommended for R?
>
> Right, and you can try to install these from sources.
> But I doubt you need it. You still have not told us whether you tried
> another mirror to download the XML file from, or what your local IT support
> tells you, given that your downloads are incomplete.
>
> Best,
> Uwe Ligges
>
>> On Sat, Apr 12, 2014 at 7:22 PM, Uwe Ligges <lig...@statistik.tu-dortmund.de> wrote:
>>
>>> On 12.04.2014 22:39, Alpesh Pandya wrote:
>>>
>>>> Thank you for the response, Uwe. I tried multiple times, downloading the
>>>> zip file from many sources, but still get the same error. This is a major
>>>> roadblock for me in using R. I appreciate any help on this.
>>>
>>> Please ask your local IT staff.
>>>
>>> I get, using the same mirror:
>>>
>>> options("repos"=c(CRAN="http://watson.nci.nih.gov/cran_mirror"))
>>> install.packages("XML", lib="d:/temp")
>>>
>>> trying URL 'http://watson.nci.nih.gov/cran_mirror/bin/windows/contrib/3.0/XML_3.98-1.1.zip'
>>> Content type 'application/zip' length 4288136 bytes (4.1 Mb)
>>> opened URL
>>> downloaded 4.1 Mb
>>>
>>> package 'XML' successfully unpacked and MD5 sums checked
>>>
>>> The downloaded binary packages are in
>>>  d:\temp\RtmpqMqL8L\downloaded_packages
>>>
>>> Best,
>>> Uwe Ligges
>>>
>>>> On Fri, Apr 11, 2014 at 6:53 PM, Uwe Ligges <lig...@statistik.tu-dortmund.de> wrote:
>>>>
>>>>> Works for me.
>>>>>
>>>>> Best,
>>>>> Uwe Ligges
>>>>>
>>>>> On 11.04.2014 17:10, Alpesh Pandya wrote:
>>>>>
>>>>>> Using the install.packages('XML') command produces this error:
>>>>>>
>>>>>> trying URL 'http://watson.nci.nih.gov/cran_mirror/bin/windows/contrib/3.0/XML_3.98-1.1.zip'
>>>>>> Content type 'application/zip' length 4288136 bytes (4.1 Mb)
>>>>>> opened URL
>>>>>> downloaded 4.1 Mb
>>>>>>
>>>>>> Error in read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
>>>>>>  cannot open the connection
>>>>>> In addition: Warning messages:
>>>>>> 1: In download.file(url, destfile, method, mode = "wb", ...) :
>>>>>>  downloaded length 4276224 != reported length 4288136
>>>>>> 2: In unzip(zipname, exdir = dest) : error 1 in extracting from zip file
>>>>>> 3: In read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
>>>>>>  cannot open compressed file 'XML/DESCRIPTION', probable reason 'No such file or directory'
>>>>>>
>>>>>> Upon receiving this error, I downloaded XML_3.98-1.1.zip directly from
>>>>>> the CRAN site. But this zip file is not a valid archive (it cannot be
>>>>>> opened with WinZip). Trying to install from this downloaded file also
>>>>>> produces the following error:
>>>>>>
>>>>>> Installing package into 'C:/Users/APandya/Documents/R/win-library/3.0'
>>>>>> (as 'lib' is unspecified)
>>>>>> Warning in install.packages :
>>>>>>  error 1 in extracting from zip file
>>>>>> Warning in install.packages :
>>>>>>  cannot open compressed file 'XML/DESCRIPTION', probable reason 'No such file or directory'
>>>>>> Error in install.packages : cannot open the connection
>>>>>>
>>>>>> I downloaded this zip file from multiple sources and tried to install,
>>>>>> with the same result.

-- 
Thanks and Regards
Alpesh


Re: [R] PGLM Package: Starting Values for Within-Model

2014-04-14 Thread jaschwer

Has anyone found an answer to that question? I have the same problem and
can't find a solution.

On Friday, 10 May 2013 10:41:23 UTC+2, MaxFranke wrote:
>
> I am currently using the PGLM package and I would like to implement a 
> within-model. Unfortunately, I do not succeed as I am not a big expert in 
> panel regression. 
>
> I am using the example data set from the PGLM package: 
>
> library(pglm) 
> data('Unions', package = 'pglm') 
> anb <- pglm(union~wage+exper+rural, Unions, family=binomial('probit'), 
> model="within",  method = "bfgs", print.level=0, R = 5, iterlim=2) 
>
> The pglm-function needs the argument start for starting values. What do I 
> use? 
>
> Thank you very much! 
>

[R] Trend test for hazard ratios

2014-04-14 Thread mkleber74

Hello,

I have the following problem. I stratified my patient cohort into three
ordered groups and performed multivariate adjusted Cox regression analysis
on each group separately. Now I would like to calculate a p for trend across
the hazard ratios that I got for the three groups. How can I do that if I
only have the HR and the confidence interval? For example I got the
following HRs for one endpoint: 

1.09(0.68-1.74),1.29(0.94-1.76) and 1.64(1.01-2.68). 

There is a trend but how do I calculate if it is significant?

Best regards

Marcus Kleber
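
One possible approach when only the HRs and confidence intervals are available, offered as a sketch rather than a definitive method: recover the standard errors of the log hazard ratios from the interval widths, then regress log(HR) on an ordinal group score with inverse-variance weights and test the slope. The assumptions are mine, not the poster's: the three estimates are treated as independent (they come from separate patient groups), the intervals are taken to be 95% Wald intervals, and the groups are scored 1, 2, 3. If the patient-level data are available, fitting one model with an interaction term would be the more direct route.

```r
# Hazard ratios and 95% confidence limits quoted in the post.
hr    <- c(1.09, 1.29, 1.64)
lower <- c(0.68, 0.94, 1.01)
upper <- c(1.74, 1.76, 2.68)

# Work on the log scale; recover the standard errors from the CI width.
y  <- log(hr)
se <- (log(upper) - log(lower)) / (2 * qnorm(0.975))
w  <- 1 / se^2                       # inverse-variance weights

# Assumed scores for the ordered groups (equally spaced).
x <- c(1, 2, 3)

# Weighted least-squares slope of log(HR) on the score, treating the
# variances as known, so that var(slope) = 1 / sum(w * (x - xbar)^2).
xbar     <- sum(w * x) / sum(w)
sxx      <- sum(w * (x - xbar)^2)
slope    <- sum(w * (x - xbar) * y) / sxx
se_slope <- sqrt(1 / sxx)

z       <- slope / se_slope
p_trend <- 2 * pnorm(-abs(z))        # two-sided p-value for the trend
print(c(slope = slope, se = se_slope, z = z, p = p_trend))
```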




[R] How to do abstraction of the document using R

2014-04-14 Thread goodnewsformood

Hi,
 
I have a document without an abstract. I am trying to generate an abstract
(a short summary) of the document using R. How can I get a short abstract?
I have no idea how to do this. Can anyone help? Thank you!

[R] how to do poisson regression for a complex sample design data by svyglm

2014-04-14 Thread Hanze Zhang

Hi,

I am having trouble fitting a Poisson regression to complex sample design
data with svyglm (survey package). My dataset contains these variables:
casenumber, y (which is a count), x1, x2, stratum, weight. It is a complex
sample design with equal probability sampling without replacement. How can
I fit a Poisson regression like "y = x1 + x2" with svyglm?

Thank you,
Hanze
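
A minimal sketch of how such a model is usually set up, under several assumptions of mine: the data are simulated stand-ins using the variable names from the post, each case is its own sampling unit within its stratum, and no finite population correction is supplied (the fpc argument of svydesign would be the place for it if stratum population sizes are known). The quasipoisson() family is the one the survey documentation suggests for count outcomes, since it avoids warnings about non-integer weighted counts. The model is only fitted if the survey package is installed.

```r
# Simulated stand-in data; names follow the post, values are made up.
set.seed(1)
d <- data.frame(
  casenumber = 1:200,
  stratum    = rep(1:4, each = 50),
  weight     = rep(c(10, 20, 15, 5), each = 50),
  x1         = rnorm(200),
  x2         = rbinom(200, 1, 0.5)
)
d$y <- rpois(200, lambda = exp(0.2 + 0.3 * d$x1 - 0.4 * d$x2))

fit <- NULL
if (requireNamespace("survey", quietly = TRUE)) {
  # Stratified design: one sampling unit per case, weights as given.
  des <- survey::svydesign(id = ~casenumber, strata = ~stratum,
                           weights = ~weight, data = d)
  # Design-based Poisson regression of y on x1 and x2.
  fit <- survey::svyglm(y ~ x1 + x2, design = des, family = quasipoisson())
  print(summary(fit))
}
```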


[R] program

2014-04-14 Thread kafi dano

Dear Sir,
I need your help to correct the attached R code.

When I run this code, it gives me a bad result.


The R program is attached.

Thank you









 
Kafi Dano Pati
Ph.D candidate ( mathematics/statistics)
Department of mathematical Science/ faculty of Science
University Technology Malaysia
81310 UTM, Johor Bahru, Johor, Malaysia
IC. NO. 201202F10234
Matric No. PS113113
HP. No. 00601117517559
E-mail: kafi_d...@yahoo.com
supervisor- Assoc. Prof. Robiah Binti Adnan

[R] gRain on R 3.1.0

2014-04-14 Thread Anna Eklöf

I am using the gRain package but I can't get it to work under R 3.1.0. It is no
longer available on CRAN.
Does anyone have suggestions for how to get a successful installation?

A






Re: [R] Read.table mucks up headers

2014-04-14 Thread Milan Bouchet-Valat


On Monday 14 April 2014 at 08:50 -0700, Jeff Newmiller wrote:
> You have not posted the input to your code so it is not reproducible.
> Also, you have posted in HTML, which is notorious for corrupting R
> code and data in emails.
> 
> If I were to guess, though, it looks to me like you are working with a
> UTF-8 file with a Byte Order Mark (header). I don't know the "correct"
> way to address this, but if you only have plain text data then the
> rest of your table should be valid and you could reset the name of the
> first column yourself like
> 
> names(sim)[1] <- "X"
If the byte order mark is the problem, then the natural way of handling
this is to pass fileEncoding="UTF-8-BOM" to read.table().


Regards
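
The fileEncoding suggestion above can be tried on a small self-made file. The sketch writes a temporary file that starts with the three UTF-8 byte order mark bytes and reads it back; the file contents are made up and only loosely follow the poster's data.

```r
# Write a small space-separated file that begins with a UTF-8 BOM.
tmp <- tempfile(fileext = ".txt")
bom <- as.raw(c(0xEF, 0xBB, 0xBF))
writeBin(c(bom, charToRaw("X Y1 Y2\n1 2.5501 2.5501\n2 4.105 4.105\n")), tmp)

# "UTF-8-BOM" tells R to drop the byte order mark while reading.
sim <- read.table(tmp, header = TRUE, fileEncoding = "UTF-8-BOM")
print(names(sim))

unlink(tmp)
```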

> ---
> Jeff Newmiller
> Research Engineer (Solar/Batteries/Software/Embedded Controllers)
> ---
> Sent from my phone. Please excuse my brevity.
> 
> On April 14, 2014 7:52:11 AM PDT, Pavneet Arora wrote:
> >Hey All
> >
> >I am trying to read in a small text file using read.table. 
> >
> >dput(sim)
> >structure(list(��...X. = 1:7, Y1 = c(2.5501, 4.105, 5.4687, 7.0009, 
> >8.5066, 9.785, 11.5167), Y2 = c(2.5501, 4.105, 5.4687, 11.0009, 
> >8.5066, 9.785, 11.5167), Y3 = c(2.5501, 4.105, 5.4687, 7.0009, 
> >8.5066, 9.785, 15.5167)), .Names = c("��...X.", "Y1", "Y2", "Y3"
> >), class = "data.frame", row.names = c(NA, -7L))
> >
> >But for some reason my first header comes as "?...X. ", instead of just
> >"X". Can someone please tell me why? And how to fix it?
> >
> >This is what I did in my code, using R-studio
> >> sim<-read.table("C:/00Pavneet/# MSc/# temp/sim data 1.txt",header=T)
> >> View(sim)
> >
> >

Re: [R] Read.table mucks up headers

2014-04-14 Thread S Ellison

> But for some reason my first header comes as "?...X. ", instead of just "X".
> Can some one please tell me why? And how to fix it?

i) What were the separator characters in the original data file header row? 
ii) Your first character is not being decoded properly; check the file encoding 
and set accordingly?
iii) See if the offending first character(s) can sensibly be stripped before 
reading the file?

S Ellison




Re: [R] Read.table mucks up headers

2014-04-14 Thread Jeff Newmiller

You have not posted the input to your code so it is not reproducible. Also, you 
have posted in HTML, which is notorious for corrupting R code and data in 
emails.

If I were to guess, though, it looks to me like you are working with a UTF-8 
file with a Byte Order Mark (header). I don't know the "correct" way to address 
this, but if you only have plain text data then the rest of your table should 
be valid and you could reset the name of the first column yourself like

names(sim)[1] <- "X"
---
Jeff Newmiller
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

On April 14, 2014 7:52:11 AM PDT, Pavneet Arora wrote:
>Hey All
>
>I am trying to read in a small text file using read.table. 
>
>dput(sim)
>structure(list(��...X. = 1:7, Y1 = c(2.5501, 4.105, 5.4687, 7.0009, 
>8.5066, 9.785, 11.5167), Y2 = c(2.5501, 4.105, 5.4687, 11.0009, 
>8.5066, 9.785, 11.5167), Y3 = c(2.5501, 4.105, 5.4687, 7.0009, 
>8.5066, 9.785, 15.5167)), .Names = c("��...X.", "Y1", "Y2", "Y3"
>), class = "data.frame", row.names = c(NA, -7L))
>
>But for some reason my first header comes as "?...X. ", instead of just
>"X". Can someone please tell me why? And how to fix it?
>
>This is what I did in my code, using R-studio
>> sim<-read.table("C:/00Pavneet/# MSc/# temp/sim data 1.txt",header=T)
>> View(sim)
>
>

Re: [R] Selecting variables in a multivariate regression

2014-04-14 Thread Edson Tirelli

   Bert,

   I am sorry for having troubled you. The double post was a mistake
because gmail sent the first e-mail in html and it went into
moderation queue. The second one was sent in plain text. I did not
know the moderator approved my first post.

1. As I said, I am a beginner in statistics and in R. I did spend ~8
hours yesterday googling around, reading tutorials and testing out
solutions. I also completed a couple of Coursera courses on data analysis
and R programming over the last few months, but since I was not able
to solve the problem by myself, I was hoping the friendly R community
would help me.

2. Please see my comments to (1).

   Having said that, as a beginner in R, I had no idea there was a
package called "lavaan" that easily solves my problem. Someone else
pointed me to it on Stack Overflow, and with his help I was able to
solve my problem:

http://stackoverflow.com/questions/23048501/selecting-variables-in-a-multivariate-regression-in-r

   Please note that my actual problem is considerably more complex than
that, with 15 independent variables and 11 dependent ones. The toy example
in this question was my attempt to simplify the question so that people
with experience could point me in the right direction.

   Thank you anyway, I won't bother you again.

   Edson

On Mon, Apr 14, 2014 at 9:33 AM, Bert Gunter  wrote:
> Well, this is your second post on the same topic, your first having
> received no response. So you should suspect something is amiss and
> reconsider before continuing, don't you think?
>
> 1. I, for one, was not able to make any sense of your query. You do
> not appear to understand regression, so I would suggest you spend time
> with a local statistical resource before continuing with online
> posts. If my understanding of your misunderstanding is correct, you
> need to comprehend basics. If not, apologies.
>
> 2. Have you read An Introduction to R (ships with R) or an online R
> tutorial of your choice? If not, do so before posting here further. We
> expect minimal efforts of posters to solve their own problems before
> posting. Again, apologies if I err.
>
> Cheers,
> Bert
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
> (650) 467-7374
>
> "Data is not information. Information is not knowledge. And knowledge
> is certainly not wisdom."
> H. Gilbert Welch
>
>
>
>
> On Sun, Apr 13, 2014 at 8:08 PM, Edson Tirelli  wrote:
>> I am quite new to R and I am having trouble figuring out how to select
>> variables in a multivariate linear regression in R. My google-fu also
>> did not find anything.
>>
>> Pretend I have the following formulas:
>>
>> P = aX + bY
>> Q = cZ + bY
>>
>> I have a data frame with column P, Q, X, Y, Z and I need to find a, b and c.
>>
>> If I do a simple multivariate regression:
>>
>> result <- lm( cbind( P, Q ) ~ X + Y + Z - 1 )
>>
>> It calculates a coefficient for "c" on P's regression and for "a" on
>> Q's regression.
>>
>> If I calculate the regressions individually then "b" will be different
>> in each regression.
>>
>> How can I select the variables to consider in a multivariate
>> regression? I.e., how do I tell R to ignore cZ when calculating P, and
>> ignore aX when calculating Q?
>>
>> Thank you,
>> Edson
>>

-- 
  Edson Tirelli
  Principal Software Engineer
  Red Hat Business Systems and Intelligence Group
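
For the toy problem quoted above (P = aX + bY and Q = cZ + bY with a shared b), a base R alternative to lavaan is to stack the two equations into one regression, zeroing out X in the Q rows and Z in the P rows. This is only a sketch: the data are simulated, the "true" coefficients are invented for illustration, and stacking assumes both equations have the same error variance.

```r
set.seed(1)
n <- 200
X <- rnorm(n); Y <- rnorm(n); Z <- rnorm(n)

# Invented "true" values, used only to simulate example data.
a <- 2; b <- -1; c0 <- 0.5
P <- a  * X + b * Y + rnorm(n, sd = 0.1)
Q <- c0 * Z + b * Y + rnorm(n, sd = 0.1)

# Stack both responses; X contributes only to P rows, Z only to Q rows,
# and Y appears in both, which forces a single shared coefficient b.
stacked <- data.frame(
  resp = c(P, Q),
  Xs   = c(X, rep(0, n)),
  Ys   = c(Y, Y),
  Zs   = c(rep(0, n), Z)
)

fit <- lm(resp ~ Xs + Ys + Zs - 1, data = stacked)
print(coef(fit))   # estimates of a, b, c in that order
```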


Re: [R] Growth of CRAN?

2014-04-14 Thread Spencer Graves


Hi, Hadley, et al.:


  Is anyone interested in mining the link Hadley provided below to 
compile a data.frame of the growth of the size of CRAN? I'll happily add 
it to the Ecdat package unless someone would rather put it someplace 
else.  It would be great for a study using, e.g., Bayesian Model 
Averaging to forecast the growth.  I got good results using the "drc" 
package, but findFn("growth") identified others that could ultimately be 
more useful.



  Spencer


On 4/14/2014 8:15 AM, Hadley Wickham wrote:

Yes, because it has every version of every DESCRIPTION.

Hadley

On Mon, Apr 14, 2014 at 11:13 AM, Spencer Graves
 wrote:

Hi, Hadley:



On 4/14/2014 5:53 AM, Hadley Wickham wrote:

For finer level detail, have a look at
https://github.com/hadley/cran-packages. It contains the description
file of every package ever uploaded to CRAN (the cache is a few months
out of date, but you can easily re-run)



   Can that be mined to get the date of first commit to CRAN? Dates in
description files are sometimes updated.  Example:  The "Date" in the
Description file for the version of "fda" now on CRAN is 2013-05, but fda
was on CRAN at least in 2006 and probably earlier.  You are listed as second
author on that package based on work you did by 2006.


   Spencer



Hadley

On Sun, Apr 13, 2014 at 12:59 PM, Spencer Graves
 wrote:

What data exist on the growth of CRAN?


John Fox published some data on it in 2009 ("Aspects of the Social
Organization and Trajectory of the R Project", R Journal,
http://journal.r-project.org/archive/2009-2/RJournal_2009-2_Fox.pdf).
Below please find those numbers plus some additions of mine since. If
anyone else has other numbers (or more accurate numbers), I'd like to know.


I plan to add a "CRANpackages" data set to the "Ecdat" package with a
title, "Growth of CRAN", unless someone else provides better numbers or
title.  [If it already exists on CRAN, I'd like to know. I doubt if it
does, because I couldn't find it with findFn{sos} for 'number of CRAN
packages' and 'growth of CRAN'.]


Thanks,
Spencer


date        packages
2001-06-21  110
2001-12-17  129
2002-06-12  162
2003-05-27  219
2003-11-16  273
2004-06-05  357
2004-10-12  406
2005-06-18  548
2005-12-16  647
2006-05-31  739
2006-12-12  911
2007-04-12  1000
2007-11-16  1300
2008-03-18  1427
2008-10-18  1614
2009-09-17  1952
2012-06-12  3786
2012-11-01  4082
2012-12-14  4210
2013-10-28  4960
2013-11-08  5000
2014-04-13  5428


* NOTE:  These numbers may differ slightly from other sources. Numbers
through 2009 were read from a plot in Fox's paper.  Numbers since were
read at an arbitrary time during the day, Pacific time, from a mirror in
the United States and could differ from those recorded by someone else
using a different mirror having the same date in local time someplace else.
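
The counts above can be put into a data frame directly. The sketch below uses the name "CRANpackages" from the post; the log-linear fit and the doubling-time calculation are my own illustration of one simple way to summarise the growth, not the analysis proposed in the thread.

```r
# Package counts as listed in the post.
CRANpackages <- data.frame(
  date = as.Date(c(
    "2001-06-21", "2001-12-17", "2002-06-12", "2003-05-27", "2003-11-16",
    "2004-06-05", "2004-10-12", "2005-06-18", "2005-12-16", "2006-05-31",
    "2006-12-12", "2007-04-12", "2007-11-16", "2008-03-18", "2008-10-18",
    "2009-09-17", "2012-06-12", "2012-11-01", "2012-12-14", "2013-10-28",
    "2013-11-08", "2014-04-13")),
  packages = c(110, 129, 162, 219, 273, 357, 406, 548, 647, 739, 911,
               1000, 1300, 1427, 1614, 1952, 3786, 4082, 4210, 4960,
               5000, 5428)
)

# Illustrative log-linear growth fit: log(packages) against time in days.
CRANpackages$days <- as.numeric(CRANpackages$date - CRANpackages$date[1])
fit <- lm(log(packages) ~ days, data = CRANpackages)

# Implied doubling time in years under exponential growth.
doubling_years <- unname(log(2) / coef(fit)["days"] / 365.25)
print(doubling_years)
```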




--
Spencer Graves, PE, PhD
President and Chief Technology Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567
web:  www.structuremonitoring.com


Re: [R] Comparing initial eigenvalues to broken stick results

2014-04-14 Thread dcarlson

It helps a great deal if you provide a small data set using
dput() and indicate what packages need to be loaded for the
functions you are using. This example uses random data so there
are no eigenvalues above the initial broken stick values:

> set.seed(42)
> x <- matrix(rnorm(200), 20, 10)
> require(vegan)
> bs <- rle(eigen(cor(x))$values>bstick(10,tot.var=10))
> as.vector(ifelse(bs$values[1]==TRUE, bs$lengths[1], 0))
[1] 0

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
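
The same comparison can be sketched in base R without vegan, which may help when wrapping it in a function. The helper names broken_stick() and n_retain() are made up for this illustration; the stick lengths follow the usual broken stick formula, where stick i is (tot.var / n) times the sum of 1/k for k from i to n, which reproduces the Stick1 and Stick2 values shown in the original post below.

```r
# Broken-stick expectations for n components.
broken_stick <- function(n, tot.var = n) {
  (tot.var / n) * rev(cumsum(1 / (n:1)))
}

# Number of leading eigenvalues that exceed their broken-stick value;
# counting stops at the first eigenvalue that falls below its stick.
n_retain <- function(ev, tot.var = sum(ev)) {
  bs <- broken_stick(length(ev), tot.var)
  sum(cumprod(ev > bs))
}

# Same random data as in the example above.
set.seed(42)
x  <- matrix(rnorm(200), 20, 10)
ev <- eigen(cor(x))$values
print(n_retain(ev))
```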

-----Original Message-----
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org] On Behalf Of Allyson
Combes
Sent: Sunday, April 13, 2014 11:01 PM
To: r-help@R-project.org
Subject: [R] Comparing initial eigenvalues to broken stick
results

I am trying to create a function that will allow me to determine
the number of components to retain based on the results of the
broken stick criterion.  In order to do so I know I need to
compare the initial eigenvalues to the broken stick eigenvalues.
The first initial eigenvalue that falls below its broken stick
value is the cutoff point, so the number of components to retain
is the number of eigenvalues before this cutoff point.




So far this is the syntax I have and what I get.




ev <- eigen(cor(EFAexample)) 
ev 
bstick(24, tot.var=24)


> ev
$values
 [1] 7.2819381 2.3951299 1.8190170 1.605 0.9862474 0.9092235 0.8269931
 [8] 0.7861644 0.6978157 0.6824547 0.6333925 0.5997783 0.5737571 0.5347758
[15] 0.4976710 0.4849214 0.4502025 0.4223273 0.3819080 0.3599697 0.3353226
[22] 0.3184069 0.2146300 0.2022866


> bstick(24, tot.var=24)
    Stick1     Stick2     Stick3     Stick4     Stick5     Stick6     Stick7
3.77595818 2.77595818 2.27595818 1.94262484 1.69262484 1.49262484 1.32595818
    Stick8     Stick9    Stick10    Stick11    Stick12    Stick13    Stick14
1.18310103 1.05810103 0.94698992 0.84698992 0.75608083 0.67274750 0.59582442
   Stick15    Stick16    Stick17    Stick18    Stick19    Stick20    Stick21
0.52439585 0.45772918 0.39522918 0.33640566 0.28085010 0.22821852 0.17821852
   Stick22    Stick23    Stick24
0.13059947 0.08514493 0.0417


In this case the cutoff is at stick 2, so we would retain only
1 component.  What syntax can I use to make this comparison
automatically instead of doing it manually each time?


Also, am I using bstick correctly?  From what I understand I
should just have to enter the number of components and the total
variance which will be the total number of components.


Any help would be greatly appreciated.  Thanks!


Allyson

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Growth of CRAN?

2014-04-14 Thread Spencer Graves


Hi, Hadley:


On 4/14/2014 5:53 AM, Hadley Wickham wrote:

For finer level detail, have a look at
https://github.com/hadley/cran-packages. It contains the description
file of every package ever uploaded to CRAN (the cache is a few months
out of date, but you can easily re-run)



  Can that be mined to get the date of first commit to CRAN? Dates 
in description files are sometimes updated.  Example:  The "Date" in the 
Description file for the version of "fda" now on CRAN is 2013-05, but 
fda was on CRAN at least in 2006 and probably earlier.  You are listed 
as second author on that package based on work you did by 2006.



  Spencer


Hadley

On Sun, Apr 13, 2014 at 12:59 PM, Spencer Graves
 wrote:

   What data exist on the growth of CRAN?


   John Fox published some data on it in 2009 ("Aspects of the Social
Organization and Trajectory of the R Project", R Journal,
http://journal.r-project.org/archive/2009-2/RJournal_2009-2_Fox.pdf). Below
please find those numbers plus some additions of mine since. If anyone else
has other numbers (or more accurate numbers), I'd like to know.


   I plan to add a "CRANpackages" data set to the "Ecdat" package with a
title, "Growth of CRAN" unless someone else provides better numbers or
title.  [If it already exists on CRAN, I'd like to know. I doubt if it does,
because I couldn't find it with findFn{sos} for 'number of CRAN packages'
and 'growth of CRAN'.]


   Thanks,
   Spencer


date        packages
2001-06-21  110
2001-12-17  129
2002-06-12  162
2003-05-27  219
2003-11-16  273
2004-06-05  357
2004-10-12  406
2005-06-18  548
2005-12-16  647
2006-05-31  739
2006-12-12  911
2007-04-12  1000
2007-11-16  1300
2008-03-18  1427
2008-10-18  1614
2009-09-17  1952
2012-06-12  3786
2012-11-01  4082
2012-12-14  4210
2013-10-28  4960
2013-11-08  5000
2014-04-13  5428


* NOTE:  These numbers may differ slightly from other sources. Numbers
through 2009 were read from a plot in Fox's paper.  Numbers since were read
at an arbitrary time during the day, Pacific time, from a mirror in the
United States and could differ from those recorded by someone else using a
different mirror having the same date in local time  someplace else.

--
Spencer Graves, PE, PhD
President and Chief Technology Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567
web:  www.structuremonitoring.com


Re: [R] Growth of CRAN?

2014-04-14 Thread Spencer Graves

p.s.  Have backup copies of CRAN been made, e.g., annually or with each 
new release?  Or are there summaries of the tests done with each new 
release?  I'm looking for something that could be checked to compile a 
more accurate and consistent database than what we have now.  For me, 
this is idle curiosity.  However, this could be used to help plan the 
expansion of CRAN, providing forecasts with confidence bounds.
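
As a rough illustration of such a forecast, here is a minimal sketch that uses only the counts posted in this thread. It assumes roughly exponential growth (a straight line on the log scale), which is my assumption and not something the thread establishes; the target date 2015-01-01 is arbitrary, and the interval is a prediction interval from an ordinary lm() fit.

```r
# Package counts as posted in this thread (dates and counts copied verbatim).
CRANpackages <- data.frame(
  date = as.Date(c(
    "2001-06-21", "2001-12-17", "2002-06-12", "2003-05-27", "2003-11-16",
    "2004-06-05", "2004-10-12", "2005-06-18", "2005-12-16", "2006-05-31",
    "2006-12-12", "2007-04-12", "2007-11-16", "2008-03-18", "2008-10-18",
    "2009-09-17", "2012-06-12", "2012-11-01", "2012-12-14", "2013-10-28",
    "2013-11-08", "2014-04-13")),
  packages = c(110, 129, 162, 219, 273, 357, 406, 548, 647, 739, 911,
               1000, 1300, 1427, 1614, 1952, 3786, 4082, 4210, 4960,
               5000, 5428)
)
CRANpackages$days <- as.numeric(CRANpackages$date)

# Assumption: growth is roughly exponential, so log(packages) is linear in time.
fit <- lm(log(packages) ~ days, data = CRANpackages)

# Forecast for an arbitrary future date with a 95% prediction interval,
# back-transformed from the log scale.
new <- data.frame(days = as.numeric(as.Date("2015-01-01")))
forecast <- exp(predict(fit, newdata = new, interval = "prediction"))
print(forecast)
```

The gap between 2009 and 2012 and the mix of sources noted above both limit how much weight such a fit can carry.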







Re: [R] Growth of CRAN?

2014-04-14 Thread Hadley Wickham

Yes, because it has every version of every DESCRIPTION.

Hadley

On Mon, Apr 14, 2014 at 11:13 AM, Spencer Graves
 wrote:
> Hi, Hadley:
>
>
>
> On 4/14/2014 5:53 AM, Hadley Wickham wrote:
>>
>> For finer level detail, have a look at
>> https://github.com/hadley/cran-packages. It contains the description
>> file of every package ever uploaded to CRAN (the cache is a few months
>> out of date, but you can easily re-run)
>
>
>
>   Can that be mined to get the date of first commit to CRAN? Dates in
> description files are sometimes updated.  Example:  The "Date" in the
> Description file for the version of "fda" now on CRAN is 2013-05, but fda
> was on CRAN at least in 2006 and probably earlier.  You are listed as second
> author on that package based on work you did by 2006.
>
>
>   Spencer
>
>
>> Hadley
>>
>> On Sun, Apr 13, 2014 at 12:59 PM, Spencer Graves
>>  wrote:
>>>
>>>What data exist on the growth of CRAN?
>>>
>>>
>>>John Fox published some data on it in 2009 ("Aspects of the Social
>>> Organization and Trajectory of the R Project", R Journal,
>>> http://journal.r-project.org/archive/2009-2/RJournal_2009-2_Fox.pdf).
>>> Below
>>> please find those numbers plus some additions of mine since. If anyone
>>> else
>>> has other numbers (or more accurate numbers), I'd like to know.
>>>
>>>
>>>I plan to add a "CRANpackages" data set to the "Ecdat" package
>>> with a
>>> title, "Growth of CRAN" unless someone else provides better numbers or
>>> title.  [If it already exists on CRAN, I'd like to know. I doubt if it
>>> does,
>>> because I couldn't find it with findFn{sos} for 'number of CRAN packages'
>>> and 'growth of CRAN'.]
>>>
>>>
>>>Thanks,
>>>Spencer
>>>
>>>
>>> date        packages
>>> 2001-06-21  110
>>> 2001-12-17  129
>>> 2002-06-12  162
>>> 2003-05-27  219
>>> 2003-11-16  273
>>> 2004-06-05  357
>>> 2004-10-12  406
>>> 2005-06-18  548
>>> 2005-12-16  647
>>> 2006-05-31  739
>>> 2006-12-12  911
>>> 2007-04-12  1000
>>> 2007-11-16  1300
>>> 2008-03-18  1427
>>> 2008-10-18  1614
>>> 2009-09-17  1952
>>> 2012-06-12  3786
>>> 2012-11-01  4082
>>> 2012-12-14  4210
>>> 2013-10-28  4960
>>> 2013-11-08  5000
>>> 2014-04-13  5428
>>>
>>>
>>> * NOTE:  These numbers may differ slightly from other sources. Numbers
>>> through 2009 were read from a plot in Fox's paper.  Numbers since were
>>> read
>>> at an arbitrary time during the day, Pacific time, from a mirror in the
>>> United States and could differ from those recorded by someone else using
>>> a
>>> different mirror having the same date in local time  someplace else.
>>>
>>
>>
>>
>
>
> --
> Spencer Graves, PE, PhD
> President and Chief Technology Officer
> Structure Inspection and Monitoring, Inc.
> 751 Emerson Ct.
> San José, CA 95126
> ph:  408-655-4567
> web:  www.structuremonitoring.com
>



-- 
http://had.co.nz/


[R] Read.table mucks up headers

2014-04-14 Thread Pavneet Arora

Hey All

I am trying to read in a small text file using read.table. 

dput(sim)
structure(list(Ã¯...X. = 1:7, Y1 = c(2.5501, 4.105, 5.4687, 7.0009, 
8.5066, 9.785, 11.5167), Y2 = c(2.5501, 4.105, 5.4687, 11.0009, 
8.5066, 9.785, 11.5167), Y3 = c(2.5501, 4.105, 5.4687, 7.0009, 
8.5066, 9.785, 15.5167)), .Names = c("Ã¯...X.", "Y1", "Y2", "Y3"
), class = "data.frame", row.names = c(NA, -7L))

But for some reason my first header comes as "?...X. ", instead of just 
"X". Can some one please tell me why? And how to fix it?

This is what I did in my code, using R-studio
> sim<-read.table("C:/00Pavneet/# MSc/# temp/sim data 1.txt",header=T)
> View(sim)
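
One likely cause, not confirmed anywhere in this thread, is that the file starts with a UTF-8 byte-order mark (BOM), which read.table() then treats as part of the first column name. The sketch below writes a small temporary file with a BOM (file name and contents are made up for illustration) and reads it with fileEncoding = "UTF-8-BOM" so that the mark is dropped.

```r
# Write a small test file that begins with the three BOM bytes EF BB BF.
tmp <- tempfile(fileext = ".txt")
con <- file(tmp, "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)
writeBin(charToRaw("X Y1\n1 2.5501\n2 4.105\n"), con)
close(con)

# Declaring the encoding as "UTF-8-BOM" makes R skip the mark on input.
sim <- read.table(tmp, header = TRUE, fileEncoding = "UTF-8-BOM")
print(names(sim))

# Fallback if the file cannot be re-read: rename the column afterwards.
# names(sim)[1] <- "X"
```

If the real file has no BOM, the cause is something else and this argument will not change the result.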



[R] Programming routine comp()

2014-04-14 Thread Endy BlackEndy

 Dear R users,
I am trying to call the comp() routine in package survMisc from my own code.


I read the data below with
   d = read.table("C:\\. . .", fill=TRUE, header=TRUE)
Then I load the packages 'survival' and 'survMisc',
   library(survival)
   library(survMisc)
 and I run the commands
   s = survfit(Surv(d[,2], d[,3]) ~ d[,1], data=d)
   comp(s)

 and I get the error
   Error in get(t1, loc1) : object 'd[, 2]' not found
If instead I use the commands
   s = survfit(Surv(T, Status) ~ Group, data=d)
   comp(s)

comp() runs perfectly. However, when I am programming I can't see a
way to know the variable names in advance in order to use them.
Can anybody give me a suggestion?

 Thanks in advance
   Endy

NB. The data must be stacked in three (3) columns before being read.
They are repeated in nine (9) columns to save space.

Group T Status Group T Status Group T Status
1 2081 0 1 55 1 2 414 1
1 1602 0 1 1 1 2 2204 1
1 1496 0 1 107 1 2 1063 1
1 1462 0 1 110 1 2 481 1
1 1433 0 1 332 1 2 105 1
1 1377 0 2 2569 0 2 641 1
1 1330 0 2 2506 0 2 390 1
1 996 0 2 2409 0 2 288 1
1 226 0 2 2218 0 2 421 1
1 1199 0 2 1857 0 2 79 1
1  0 2 1829 0 2 748 1
1 530 0 2 1562 0 2 486 1
1 1182 0 2 1470 0 2 48 1
1 1167 0 2 1363 0 2 272 1
1 418 1 2 1030 0 2 1074 1
1 383 1 2 860 0 2 381 1
1 276 1 2 1258 0 2 10 1
1 104 1 2 2246 0 2 53 1
1 609 1 2 1870 0 2 80 1
1 172 1 2 1799 0 2 35 1
1 487 1 2 1709 0 2 248 1
1 662 1 2 1674 0 2 704 1
1 194 1 2 1568 0 2 211 1
1 230 1 2 1527 0 2 219 1
1 526 1 2 1324 0 2 606 1
1 122 1 2 957 0
1 129 1 2 932 0
1 74 1 2 847 0
1 122 1 2 848 0
1 86 1 2 1850 0
1 466 1 2 1843 0
1 192 1 2 1535 0
1 109 1 2 1447 0
1 55 1 2 1384 0
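
One way to avoid hard-coding the variable names is to build the formula from names(d) at run time. This is a minimal sketch under assumptions: the columns are in the order group, time, status, the names are syntactically valid, and only a handful of rows from the table above are used. Whether comp() accepts the resulting fit was not checked here.

```r
# A few rows from the data above, stacked in three columns.
d <- data.frame(Group  = c(1, 1, 1, 2, 2, 2),
                T      = c(2081, 418, 383, 2569, 414, 2204),
                Status = c(0, 1, 1, 0, 1, 1))

# Build the formula text from the column names, whatever they happen to be.
nm <- names(d)
f <- as.formula(sprintf("Surv(%s, %s) ~ %s", nm[2], nm[3], nm[1]))
print(f)

# Fit only if the survival package is installed. bquote() puts the actual
# formula into the call, so the stored call contains real variable names.
if (requireNamespace("survival", quietly = TRUE)) {
  library(survival)
  s <- eval(bquote(survfit(.(f), data = d)))
  print(s)
  # comp(s)  # from survMisc; untested here
}
```

The point of bquote() is that the call recorded in the fit mentions T, Status and Group, not d[,2] and friends, which is what the error message above complains about.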


Re: [R] Selecting variables in a multivariate regression

2014-04-14 Thread peter dalgaard


On 14 Apr 2014, at 15:33 , Bert Gunter  wrote:

> Well, this is your second post on the same topic, your first having
> received no response. So you should suspect something is amiss and
> reconsider before continuing, don't you think?
> 
> 1. I, for one, was not able to make any sense of your query. You do
> not appear to understand regression, so I would suggest you spend time
> with a local statistical resource before continuing with online
> posts. If my understanding of your misunderstanding is correct, you
> need to comprehend basics. If not, apologies.
> 

The problem as such makes OK sense to me: multivariate linear model, not all 
regressors affecting all outputs. The simplest case of this is what is known as 
"seemingly unrelated regressions". The thing not known/understood by the poster 
is that such models are outside the scope of the MANOVA type models, which is 
all lm() is designed to do. The "sem" and "systemfit" packages may be of help, 
but some reading and/or consultation with someone with the relevant expertise 
may be necessary.

Peter D.
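
To make the shared-coefficient idea concrete, here is a minimal sketch on simulated data. It is not the sem or systemfit approach mentioned above: it simply stacks the two equations and runs ordinary least squares, which forces a single b but ignores any correlation or variance difference between the errors of the two equations. The true values a = 2, b = 3, c = -1 are made up.

```r
set.seed(1)
n <- 200
X <- rnorm(n); Y <- rnorm(n); Z <- rnorm(n)
# Made-up true values: a = 2, b = 3, c = -1
P <- 2 * X + 3 * Y + rnorm(n, sd = 0.1)
Q <- -1 * Z + 3 * Y + rnorm(n, sd = 0.1)

# Stack the two equations. X is zero in the Q rows and Z is zero in the
# P rows, so a and c enter only their own equation while b is shared.
stacked <- data.frame(
  resp = c(P, Q),
  X    = c(X, rep(0, n)),
  Y    = c(Y, Y),
  Z    = c(rep(0, n), Z)
)
fit <- lm(resp ~ X + Y + Z - 1, data = stacked)
print(coef(fit))
```

The coefficients on X, Y and Z estimate a, b and c respectively. For inference that accounts for the two error terms properly, the packages named above are the better route.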

> 2. Have you read An Introduction to R (ships with R) or an online R
> tutorial of your choice? If not, do so before posting here further. We
> expect minimal efforts of posters to solve their own problems before
> posting. Again, apologies if I err.
> 
> Cheers,
> Bert
> 
> Bert Gunter
> Genentech Nonclinical Biostatistics
> (650) 467-7374
> 
> "Data is not information. Information is not knowledge. And knowledge
> is certainly not wisdom."
> H. Gilbert Welch
> 
> 
> 
> 
> On Sun, Apr 13, 2014 at 8:08 PM, Edson Tirelli  wrote:
>> I am quite new to R and I am having trouble figuring out how to select
>> variables in a multivariate linear regression in R. My google-fu also
>> did not find anything.
>> 
>> Pretend I have the following formulas:
>> 
>> P = aX + bY
>> Q = cZ + bY
>> 
>> I have a data frame with column P, Q, X, Y, Z and I need to find a, b and c.
>> 
>> If I do a simple multivariate regression:
>> 
>> result <- lm( cbind( P, Q ) ~ X + Y + Z - 1 )
>> 
>> It calculates a coefficient for "c" on P's regression and for "a" on
>> Q's regression.
>> 
>> If I calculate the regressions individually then "b" will be different
>> in each regression.
>> 
>> How can I select the variables to consider in a multivariate
>> regression? I.e., how do I tell R to ignore cZ when calculating P, and
>> ignore aX when calculating Q?
>> 
>> Thank you,
>> Edson
>> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com


Re: [R] correlation with missing values.. different answers

2014-04-14 Thread peter dalgaard


On 14 Apr 2014, at 05:02 , Paul Tanger  wrote:

> Thanks, I did not realize it was deleting rows!  I was afraid to try
> "pairwise.complete.obs" because it said something about resulting in a
> matrix which is not "positive semi-definite" (and googling that term
> just confused me more).  

It means that you can get a covariance matrix that isn't one. I.e., it may 
predict that some linear combination of your variables has negative variance. 
It may turn out not to be a problem in practice, but that sort of thing tends 
to worry theoreticians.
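
A small constructed example shows how this can happen. The data are made up so that each pair of variables is observed on a different set of rows; this is a sketch of the mechanism, not something taken from the swiss data.

```r
set.seed(1)
v <- rnorm(10)

# x and y overlap on rows 1-10, y and z on rows 11-20, x and z on rows 21-30.
d <- data.frame(
  x = c(v, rep(NA, 10), v),
  y = c(v, v, rep(NA, 10)),
  z = c(rep(NA, 10), v, -v)
)

# Pairwise: cor(x, y) = 1, cor(y, z) = 1, but cor(x, z) = -1.
R <- cor(d, use = "pairwise.complete.obs")
print(R)

# A proper correlation matrix has no negative eigenvalues; this one does.
print(eigen(R)$values)
```

No single complete data set could have x perfectly correlated with y, y with z, and x perfectly anti-correlated with z, which is why the matrix fails to be positive semi-definite.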

> But I ran the dataset through JMP and got the
> same answers so I think that "pairwise.complete.obs" works for what I
> want to do.
> 

Actually, JMP 10 claims to be using the REML method, which is different from 
pairwise correlations (you can get both, so it is easy to check that they 
differ). I'm not sure we have the REML method coded up anywhere; the ML 
counterpart is in package mvnmle, and one might hope that REML isn't all that 
much harder.

> On Sun, Apr 13, 2014 at 7:36 PM, arun  wrote:
>> 
>> 
>> 
>> Hi,
>> 
>> I think in this case, when you use "na.or.complete", all the NA rows are 
>> removed for the full dataset.
>> cor(swM[-1,1:2])
>> #          Frtlty    Agrclt
>> #Frtlty 1.0000000 0.3920289
>> #Agrclt 0.3920289 1.0000000
>> 
>> cor(swM[-1,])[1:2,1:2]
>> #          Frtlty    Agrclt
>> #Frtlty 1.0000000 0.3920289
>> #Agrclt 0.3920289 1.0000000
>> 
>> Maybe you can try with "pairwise.complete.obs"
>> cor(swM, use = "pairwise.complete.obs")
>> #           Frtlty      Agrclt     Exmntn      Eductn     Cathlc      Infn.M
>> #Frtlty  1.0000000  0.39202893 -0.6531492 -0.66378886  0.4723129  0.41655603
>> #Agrclt  0.3920289  1.00000000 -0.7150561 -0.65221506  0.4152007 -0.03648427
>> #Exmntn -0.6531492 -0.71505612  1.0000000  0.69921153 -0.6003402 -0.11433546
>> #Eductn -0.6637889 -0.65221506  0.6992115  1.00000000 -0.1791334 -0.09932185
>> #Cathlc  0.4723129  0.41520069 -0.6003402 -0.17913339  1.0000000  0.18503786
>> #Infn.M  0.4165560 -0.03648427 -0.1143355 -0.09932185  0.1850379  1.00000000
>> cor(swM[,1:2],use="pairwise.complete.obs")
>> #          Frtlty    Agrclt
>> #Frtlty 1.0000000 0.3920289
>> #Agrclt 0.3920289 1.0000000
>> 
>> A.K.
>> 
>> On Sunday, April 13, 2014 9:11 PM, Paul Tanger  
>> wrote:
>> Hi,
>> I can't seem to figure out why this gives me different answers.  Probably
>> something obvious, but I thought they would be the same.
>> This is an minimal example from the help page of cor() :
>> 
>>> ## swM := "swiss" with  3 "missing"s :
>>> swM <- swiss
>>> colnames(swM) <- abbreviate(colnames(swiss), min=6)
>>> swM[1,2] <- swM[7,3] <- swM[25,5] <- NA # create 3 "missing"
>>> cor(swM, use = "na.or.complete")
>>            Frtlty      Agrclt     Exmntn      Eductn     Cathlc      Infn.M
>> Frtlty  1.0000000  0.37821953 -0.6548306 -0.67421581  0.4772298  0.38781500
>> Agrclt  0.3782195  1.00000000 -0.7127078 -0.64337782  0.4014837 -0.07168223
>> Exmntn -0.6548306 -0.71270778  1.0000000  0.69776906 -0.6079436 -0.10710047
>> Eductn -0.6742158 -0.64337782  0.6977691  1.00000000 -0.1701445 -0.08343279
>> Cathlc  0.4772298  0.40148365 -0.6079436 -0.17014449  1.0000000  0.17221594
>> Infn.M  0.3878150 -0.07168223 -0.1071005 -0.08343279  0.1722159  1.00000000
>>> # why isn't this the same?
>>> cor(swM[,c(1:2)], use = "na.or.complete")
>>           Frtlty    Agrclt
>> Frtlty 1.0000000 0.3920289
>> Agrclt 0.3920289 1.0000000
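
The difference between the two calls can be checked directly. This is a minimal sketch with the built-in swiss data and the same three missing values, but with the original column names left unabbreviated. With use = "na.or.complete", every row containing an NA in any column passed to cor() is dropped, so the full data set loses rows 1, 7 and 25 while the two-column subset loses only row 1.

```r
swM <- swiss
swM[1, 2] <- swM[7, 3] <- swM[25, 5] <- NA   # same three missing values

# All six columns: rows 1, 7 and 25 are dropped before anything is computed.
full <- cor(swM, use = "na.or.complete")[1:2, 1:2]

# Two columns only: just row 1 has an NA, so only row 1 is dropped.
sub <- cor(swM[, 1:2], use = "na.or.complete")

print(full)
print(sub)

# Each result matches a plain cor() on the rows that were actually kept.
print(all.equal(full, cor(swM[-c(1, 7, 25), 1:2])))
print(all.equal(sub,  cor(swM[-1, 1:2])))
```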
>> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com


Re: [R] Selecting variables in a multivariate regression

2014-04-14 Thread Bert Gunter

Well, this is your second post on the same topic, your first having
received no response. So you should suspect something is amiss and
reconsider before continuing, don't you think?

1. I, for one, was not able to make any sense of your query. You do
not appear to understand regression, so I would suggest you spend time
with a local statistical resource before continuing with online
posts. If my understanding of your misunderstanding is correct, you
need to comprehend basics. If not, apologies.

2. Have you read An Introduction to R (ships with R) or an online R
tutorial of your choice? If not, do so before posting here further. We
expect minimal efforts of posters to solve their own problems before
posting. Again, apologies if I err.

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
H. Gilbert Welch

On Sun, Apr 13, 2014 at 8:08 PM, Edson Tirelli  wrote:
> I am quite new to R and I am having trouble figuring out how to select
> variables in a multivariate linear regression in R. My google-fu also
> did not find anything.
>
> Pretend I have the following formulas:
>
> P = aX + bY
> Q = cZ + bY
>
> I have a data frame with column P, Q, X, Y, Z and I need to find a, b and c.
>
> If I do a simple multivariate regression:
>
> result <- lm( cbind( P, Q ) ~ X + Y + Z - 1 )
>
> It calculates a coefficient for "c" on P's regression and for "a" on
> Q's regression.
>
> If I calculate the regressions individually then "b" will be different
> in each regression.
>
> How can I select the variables to consider in a multivariate
> regression? I.e., how do I tell R to ignore cZ when calculating P, and
> ignore aX when calculating Q?
>
> Thank you,
> Edson
>

Re: [R] Growth of CRAN?

2014-04-14 Thread Hadley Wickham

For finer level detail, have a look at
https://github.com/hadley/cran-packages. It contains the description
file of every package ever uploaded to CRAN (the cache is a few months
out of date, but you can easily re-run)

Hadley

On Sun, Apr 13, 2014 at 12:59 PM, Spencer Graves
 wrote:
>   What data exist on the growth of CRAN?
>
>
>   John Fox published some data on it in 2009 ("Aspects of the Social
> Organization and Trajectory of the R Project", R Journal,
> http://journal.r-project.org/archive/2009-2/RJournal_2009-2_Fox.pdf). Below
> please find those numbers plus some additions of mine since. If anyone else
> has other numbers (or more accurate numbers), I'd like to know.
>
>
>   I plan to add a "CRANpackages" data set to the "Ecdat" package with a
> title, "Growth of CRAN" unless someone else provides better numbers or
> title.  [If it already exists on CRAN, I'd like to know. I doubt if it does,
> because I couldn't find it with findFn{sos} for 'number of CRAN packages'
> and 'growth of CRAN'.]
>
>
>   Thanks,
>   Spencer
>
>
> date        packages
> 2001-06-21  110
> 2001-12-17  129
> 2002-06-12  162
> 2003-05-27  219
> 2003-11-16  273
> 2004-06-05  357
> 2004-10-12  406
> 2005-06-18  548
> 2005-12-16  647
> 2006-05-31  739
> 2006-12-12  911
> 2007-04-12  1000
> 2007-11-16  1300
> 2008-03-18  1427
> 2008-10-18  1614
> 2009-09-17  1952
> 2012-06-12  3786
> 2012-11-01  4082
> 2012-12-14  4210
> 2013-10-28  4960
> 2013-11-08  5000
> 2014-04-13  5428
>
>
> * NOTE:  These numbers may differ slightly from other sources. Numbers
> through 2009 were read from a plot in Fox's paper.  Numbers since were read
> at an arbitrary time during the day, Pacific time, from a mirror in the
> United States and could differ from those recorded by someone else using a
> different mirror having the same date in local time  someplace else.
>



-- 
http://had.co.nz/


Re: [R] mean calculations from a dframe column

2014-04-14 Thread David McPearson

On Sun, 13 Apr 2014 05:01:40 -0700 (PDT) "andre.zacha...@gmail.com"
 wrote

> Thank you very much!!
> 
..
..
>  *From:* arun kirshna [via R]
> *Sent:* 13 April 2014 11:23
> *To:* andre.zacha...@gmail.com
> *Subject:* Re: mean calculations from a dframe column
> 
> Hi André,
> 
> Your code was missing some information. If your code looks like this:
..
.. 
> Mean <- apply(a, 2, mean, na.rm = TRUE) 
..
..
..

Andre,

Just for future reference, have a look at
?colMeans
"These functions are equivalent to use of apply with FUN = mean or FUN = sum
with appropriate margins, but are a lot faster."
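
A quick check of that equivalence, as a minimal sketch on a made-up data frame (the object a here is not André's data):

```r
# Made-up data frame with one missing value per column.
a <- data.frame(u = c(1, 2, NA, 4), v = c(10, NA, 30, 40))

m_apply    <- apply(a, 2, mean, na.rm = TRUE)
m_colMeans <- colMeans(a, na.rm = TRUE)

print(m_apply)
print(m_colMeans)
print(all.equal(m_apply, m_colMeans))
```

Both calls need all columns to be numeric; with a character or factor column mixed in, apply() would coerce everything to character and colMeans() would stop with an error.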

Cheers,
D.



Re: [R] calling in inverted commas

2014-04-14 Thread eliza botto

Dear Arun and Frede Aakmann Tøgersen,
I am extremely grateful that you took your time out and helped me on this mind 
boggling issue as you guys always do. I couldn't thank you guys earlier as I 
was on the move. The first thing I am doing, after turning on my PC, is to 
thank you on your help.
Thank-you once again,
Eliza

From: fr...@vestas.com
To: smartpink...@yahoo.com; fr...@vestas.com
CC: eliza_bo...@hotmail.com; r-help@r-project.org
Date: Sat, 12 Apr 2014 12:54:49 +0200
Subject: RE: [R] calling in inverted commas

Oh, I see.
Actually I forgot to check whether it could be an advantage to use the 
URLencode() function on queryUrl. I'll check later.
Br. Frede
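
For comparison, here is a minimal sketch of the two ways of getting a literal %2F into the query string: escaping it as %% inside a sprintf() template, or encoding the raw value with URLencode() and pasting it in. The shortened query and the coordinates are only illustrative; whether the Giovanni server accepts either form was not tested.

```r
# Option 1: sprintf() template, literal % written as %%.
pattern <- "west=%s&north=%s&begin_date=1998%%2F01%%2F01"
q1 <- sprintf(pattern, 68.25, 24.75)

# Option 2: encode the raw value with URLencode() and paste it in,
# so no %% escaping is needed.
begin <- URLencode("1998/01/01", reserved = TRUE)
q2 <- paste0("west=", 68.25, "&north=", 24.75, "&begin_date=", begin)

print(q1)
print(q2)
print(identical(q1, q2))
```

Encoding only the individual values, not the whole URL, keeps the & and = separators intact.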

Sent from Samsung mobile

 Original message 
From: arun 
Date: 12/04/2014 12.45 (GMT+01:00) 
To: Frede Aakmann Tøgersen 
Cc: Eliza Botto ,"R. Help" 
Subject: Re: [R] calling in inverted commas 

HI,

Please ignore the previous message.  I copied your codes directly from the 
email.  For some reason, the urlPattern <- ... showed some special characters.  
I manually fixed it and now it is working.

urlPattern1<-("http://disc2.nascom.nasa.gov/daac-bin/Giovanni/tovas/Giovanni_cgi.pl?west=%s&north=%s&east=%s&south=%s&params=0%%7C3B42_V7&plot_type=Time+Plot&byr=1998&bmo=01&bdy=1&eyr=2007&emo=12&edy=31&begin_date=1998%%2F01%%2F01&end_date=2014%%2F01%%2F31&cbar=cdyn&cmin=&cmax=&yaxis=ydyn&ymin=&ymax=&yint=&ascres=0.25x0.25&global_cfg=tovas.global.cfg.pl&instance_id=TRMM_V7&prod_id=3B42_daily&action=ASCII+Output")

fileDestination <- getwd()
fileNames <- paste("precip", df2[,1], df2[,2], sep = "_")
fileNames <- paste(fileNames, "txt", sep = ".")
files <- file.path(fileDestination, fileNames)

for (i in 1:nrow(df2)){
  queryUrl <- sprintf(urlPattern1, df2[i, 1], df2[i, 2], df2[i, 1], df2[i, 2])
  download.file(queryUrl, files[i])
}

## import data in first file
precip <- read.table(files[1], skip = 4, header = TRUE, na.strings = "-.9",
                     sep = "", check.names = FALSE, stringsAsFactors = FALSE)
head(precip,2)
#  Time(year:month:day) AccRain
#1           1998:01:01       0
#2           1998:01:02       0

A.K.

On , arun  wrote:

HI,

Thanks for the link.

I should have used ?sprintf().  BTW, I am not able to reproduce your results.

urlPattern <- 
"http://disc2.nascom.nasa.gov/daac-bin/Giovanni/tovas/Giovanni_cgi.pl?west=%s&north=%s&east=%s&south=%¶ms=0|3B42_V7&plot_type=Time+Plot&byr=1998&bmo=01&bdy=1&eyr=2007&emo=12&edy=31&begin_date=1998%%2F01%%2F01&end_date=2014%%2F01%%2F31&cbar=cdyn&cmin=&cmax=&yaxis=ydyn&ymin=&ymax=&yint=&ascres=0.25x0.25&global_cfg=tovas.global.cfg.pl&instance_id=TRMM_V7¢³_id=3B42_daily&action=ASCII+Output"

##
## some coordinates
df2 <- data.frame(Longitude = c(68.25, 68.75, 69.25),
                  Latitude = c(24.75, 25.25, 25.75))

fileDestination <- getwd()
fileNames <- paste("precip", df2[,1], df2[,2], sep = "_")
fileNames <- paste(fileNames, "txt", sep = ".")
files <- file.path(fileDestination, fileNames)

for (i in 1:nrow(df2)){
  queryUrl <- sprintf(urlPattern, df2[i, 1], df2[i, 2], df2[i, 1], df2[i, 2])
  download.file(queryUrl, files[i])
}

precip <- read.table(files[1], skip = 4, header = TRUE, na.strings = "-.9",
+ sep = "", check.names = FALSE, stringsAsFactors = FALSE)
Error in read.table(files[1], skip = 4, header = TRUE, na.strings = "-.9",  :
  more columns than column names

So, I checked the file "precip_68.25_24.75.txt"

Giovanni Error Message
Giovanni Error Message Page
Error: instance configuration file  is not found.
Error: product configuration file  is not found.
Please send email to Help Desk: h...@daac.gsfc.nasa.gov

sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-unknown-linux-gnu (64-bit)

A.K.

On Saturday, April 12, 2014 3:24 AM, Frede Aakmann Tøgersen  
wrote:

Hi

Sigh I'm getting a headache seeing ugly formatted R code. Arun, your code is 
almost unreadable. Have a look at e.g. 
http://yihui.name/en/2010/04/formatr-farewell-to-ugly-r-code/

Now to the substantial. Why not use the sprintf() function for formatting the 
url instead of the more involved approach using several gsub and regular 
expressions that not many people (including myself) easily understand.

## here is a small example using sprintf(), see ?sprintf
## %s is format specifier for string
exampleStr <- "west=%s&north=%s&east=%s&south=%s"
sprintf(exampleStr, 1, 2, 3, 4)
## > [1] "west=1&north=2&east=3&south=4"

## since % is used in format specification then if a literal % is needed in the
## string as here where you have e.g. %2F01% then escape those % with a %,
## i.e. % becomes %% in the string
## I have done that in urlPattern:

urlPattern <- 
"http://disc2.nascom.nasa.gov/daac-bin/Giovanni/tovas/Giovanni_cgi.pl?west=%s&north=%s&east=%s&south=%s&params=0|3B42_V7&plot_type=Time+Plot&byr=1998&bmo=01&bdy=1&eyr=2007&emo=12&edy=31
