Re: [R] new question

2013-03-28 Thread arun
Hi,
Try this:


Spec - function(lista,FDR_k) {

 list.new-lapply(lista,function(x) within(x,{spec- as.character(spec)}))
 split.list-split(list.new,names(lista))

 #Data needed with FDRFDR_k
 seq.mod.z-lapply(seq_along(split.list),function(i) 
lapply(split.list[[i]],function(x) 
x[x[[FDR]]FDR_k,c(Seq,Mod,z,spec)]))
 names(seq.mod.z)- names(split.list) 

 #insert colunm with the name of the folder
 folder.name-lapply(seq.mod.z,function(x) lapply(names(x),function(i) 
do.call(rbind,lapply(x[i],function(x) cbind(folder_name=i,x)
 #merge data with the same Seq, Mod and z
 library(plyr)
 library(data.table)
 merge.data- lapply(folder.name,function(x) lapply(x,function(x1) 
{x1-data.table(x1); 
x1[,spec:=paste(spec,collapse=,),by=c(Seq,Mod,z)]}))

 #colunm with number of spec
 count.spec-lapply(merge.data,function(x) lapply(x,function(x1) 
{x1$counts-sapply(x1$spec, function(x2) length(gsub(\\s, , 
unlist(strsplit(x2, ,);x3-as.data.frame(x1);names(x3)[6]- 
as.character(unique(x3$folder_name));x3[,-c(1,5)]}))
count.specUnique-lapply(count.spec,function(x) lapply(x,unique))
 #count spec by group (2-columns)
 spec.group-lapply(count.specUnique,function(x) Reduce(function(...) 
merge(...,by=c(Seq,Mod,z),all=TRUE),x))
  #spec.group1-spec.group[lapply(spec.group,length)!=0]

 #data frame with count of spec
 res- Reduce(function(...) 
merge(...,by=c(Seq,Mod,z),all=TRUE),spec.group)
 res[is.na(res)] - 0
res- as.data.frame(res,stringsAsFactors=FALSE)
 print(res)
}

 Spec(ListFacGroup,0.05)
#   Seq Mod z a2 c2 c3 t2
#1    aAATATAGPR  1-n_acPro/ 2  5  0  0  1
#2 aAAASSPVGVGQR  1-n_acPro/ 2  6  0  0  1
#3  aAGAAGGR  1-n_acPro/ 2  1  1  0  1
#4   AAALQAK 2  1  0  1  1
#5    aAGAGPEMVR  1-n_acPro/ 2  2  2  1  2
#6 aEQQQFYLLLGNLLSPDNVVR 1-_Carbamoylation/ 2  1  0  0  1
#7 aEQQQFYLLLGNLLSPDNVVR 1-_Carbamoylation/ 3  1  0  0  1
#8 aEQQQFYLLLGNLLSPDNVVR  1-n_acPro/ 2  0  1  0  0
#9 aEQQQFYLLLGNLLSPDNVVR  1-n_acPro/ 3  1  2  2  1
#10  APGTAEK 2  0  1  0  0
#11    aSAPQQLSDEELFSQLR  1-n_acPro/ 2  1  0  0  1
#12  aVGNAVPCGAR  1-n_acPro/ 2  1  1  1  1
#13    AWEEPSSGNGTAR 2  1  1  1  1
#14  aAAAELSLLEK  1-n_acPro/ 1  1  0  0  1
#15  aAAAELSLLEK  1-n_acPro/ 2  1  1  1  1
#16 EVLGLILR 2  1  1  1  1
#17  aAAAGAAAEGEAPAEMGALLLEK  1-n_acPro/ 3  1  1  1  1
#18  aAAAPGTAVGATGSGIAGLAVYR 1-_Carbamoylation/ 3  0  0  1  0
#19  aAAAPGTAVGATGSGIAGLAVYR  1-n_acPro/ 3  1  0  0  1
#20 aAAANSGSSLPLFDCPTWAGKPPPGLHLDVVK  1-n_acPro/ 3  1  0  0  1
#21  AAAkAAK 8-K_ac/ 2  0  1  0  0
#22 aAAAVGAGHGAGGPGAASSSGGAR  1-n_acPro/ 2  0  1  1  0
#23 aAAAVGAGHGAGGPGAASSSGGAR  1-n_acPro/ 3  0  0  1  0
#24 aAADGDDSLYPIAVLIDELR  1-n_acPro/ 2  0  0  1  0


Regarding the 2nd question, I am a bit busy now.  Will try it later.
A.K.




 From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com 
Sent: Thursday, March 28, 2013 9:43 AM
Subject: Re: new question
 

I don't remove duplicated, but write only one time. If I haven't unique I 
have the same row a lot of times, but with unique we remove all. I need this 
row write only one time. 
 
without unique the output is 
 
1    aAATATAGPR  1-n_acPro/ 2  5  0  0  1
2    aAATATAGPR  1-n_acPro/ 2  5  0  0  1
3    aAATATAGPR  1-n_acPro/ 2  5  0  0  1
4    aAATATAGPR  1-n_acPro/ 2  5  0  0  1
5    aAATATAGPR  1-n_acPro/ 2  5  0  0  1
6 aAAASSPVGVGQR  1-n_acPro/ 2  6  0  0  1
7 aAAASSPVGVGQR  1-n_acPro/ 2  6  0  0  1
8 aAAASSPVGVGQR  1-n_acPro/ 2  6  0  0  1
9 aAAASSPVGVGQR  1-n_acPro/ 2  6  0  0  1
10    aAAASSPVGVGQR  1-n_acPro/ 2  6  0  0  1
11    aAAASSPVGVGQR  1-n_acPro/ 2  6  0  0  1
12 aAGAAGGR  1-n_acPro/ 2  1  1  0  1
13  AAALQAK 2  1  0  1  1
14   aAGAGPEMVR  1-n_acPro/ 2  2  2  1  2
15    aEQQQFYLLLGNLLSPDNVVR 1-_Carbamoylation/ 2  1  0  0  1
16    aEQQQFYLLLGNLLSPDNVVR 1-_Carbamoylation/ 3  1  0  0  1
17    aEQQQFYLLLGNLLSPDNVVR  1-n_acPro/ 2  0  1  0  0
18    aEQQQFYLLLGNLLSPDNVVR  1-n_acPro/ 3  1  2  

Re: [R] new question

2013-03-28 Thread arun
Hi,
The function outputs the unique rows and also chisq test on frequency ( by row).


Spec - function(lista,FDR_k) {

 list.new-lapply(lista,function(x) within(x,{spec- as.character(spec)}))
 split.list-split(list.new,names(lista))

 #Data needed with FDRFDR_k
 seq.mod.z-lapply(seq_along(split.list),function(i) 
lapply(split.list[[i]],function(x) 
x[x[[FDR]]FDR_k,c(Seq,Mod,z,spec)]))
 names(seq.mod.z)- names(split.list) 

 #insert colunm with the name of the folder
 folder.name-lapply(seq.mod.z,function(x) lapply(names(x),function(i) 
do.call(rbind,lapply(x[i],function(x) cbind(folder_name=i,x)
 #merge data with the same Seq, Mod and z
 library(plyr)
 library(data.table)
 merge.data- lapply(folder.name,function(x) lapply(x,function(x1) 
{x1-data.table(x1); 
x1[,spec:=paste(spec,collapse=,),by=c(Seq,Mod,z)]}))

 #colunm with number of spec
 count.spec-lapply(merge.data,function(x) lapply(x,function(x1) 
{x1$counts-sapply(x1$spec, function(x2) length(gsub(\\s, , 
unlist(strsplit(x2, ,);x3-as.data.frame(x1);names(x3)[6]- 
as.character(unique(x3$folder_name));x3[,-c(1,5)]}))
count.specUnique-lapply(count.spec,function(x) lapply(x,unique))
 #count spec by group (2-columns)
 spec.group-lapply(count.specUnique,function(x) Reduce(function(...) 
merge(...,by=c(Seq,Mod,z),all=TRUE),x))
  #spec.group1-spec.group[lapply(spec.group,length)!=0]

 #data frame with count of spec
 res- Reduce(function(...) 
merge(...,by=c(Seq,Mod,z),all=TRUE),spec.group)
 res[is.na(res)] - 0
res- as.data.frame(res,stringsAsFactors=FALSE)
 #print(res)
Chisq1test_Count-do.call(cbind,lapply(as.data.frame(combn(names(res)[4:ncol(res)],2),stringsAsFactors=FALSE),function(x)
 
{x1-data.frame(apply(cbind(res[x[1]],res[x[2]]),1,function(y){ifelse(sum(y)==0,
 NA, chisq.test(y)$p.value)}));colnames(x1)- paste0(Count_,x[1],x[2]);x1}))
#print(Chisq1test_Count)
res1- cbind(res,Chisq1test_Count)
res1
}

ListFacGroup-ReadDir(FacGroup)
Spec(ListFacGroup,0.05)
 head(Spec(ListFacGroup,0.05))
#    Seq Mod z a2 c2 c3 t2 Count_a2c2
#1    aAATATAGPR  1-n_acPro/ 2  5  0  0  1 0.02534732
#2 aAAASSPVGVGQR  1-n_acPro/ 2  6  0  0  1 0.01430588
#3  aAGAAGGR  1-n_acPro/ 2  1  1  0  1 1.
#4   AAALQAK 2  1  0  1  1 0.31731051
#5    aAGAGPEMVR  1-n_acPro/ 2  2  2  1  2 1.
#6 aEQQQFYLLLGNLLSPDNVVR 1-_Carbamoylation/ 2  1  0  0  1 0.31731051
 # Count_a2c3 Count_a2t2 Count_c2c3 Count_c2t2 Count_c3t2
#1 0.02534732 0.10247043 NA  0.3173105  0.3173105
#2 0.01430588 0.05878172 NA  0.3173105  0.3173105
#3 0.31731051 1.  0.3173105  1.000  0.3173105
#4 1. 1.  0.3173105  0.3173105  1.000
#5 0.56370286 1.  0.5637029  1.000  0.5637029
#6 0.31731051 1. NA  0.3173105  0.3173105

A.K.



 From: arun smartpink...@yahoo.com
To: Vera Costa veracosta...@gmail.com 
Cc: R help r-help@r-project.org 
Sent: Thursday, March 28, 2013 10:18 AM
Subject: Re: [R] new question
 
Hi,
Try this:


Spec - function(lista,FDR_k) {

 list.new-lapply(lista,function(x) within(x,{spec- as.character(spec)}))
 split.list-split(list.new,names(lista))

 #Data needed with FDRFDR_k
 seq.mod.z-lapply(seq_along(split.list),function(i) 
lapply(split.list[[i]],function(x) 
x[x[[FDR]]FDR_k,c(Seq,Mod,z,spec)]))
 names(seq.mod.z)- names(split.list) 

 #insert colunm with the name of the folder
 folder.name-lapply(seq.mod.z,function(x) lapply(names(x),function(i) 
do.call(rbind,lapply(x[i],function(x) cbind(folder_name=i,x)
 #merge data with the same Seq, Mod and z
 library(plyr)
 library(data.table)
 merge.data- lapply(folder.name,function(x) lapply(x,function(x1) 
{x1-data.table(x1); 
x1[,spec:=paste(spec,collapse=,),by=c(Seq,Mod,z)]}))

 #colunm with number of spec
 count.spec-lapply(merge.data,function(x) lapply(x,function(x1) 
{x1$counts-sapply(x1$spec, function(x2) length(gsub(\\s, , 
unlist(strsplit(x2, ,);x3-as.data.frame(x1);names(x3)[6]- 
as.character(unique(x3$folder_name));x3[,-c(1,5)]}))
count.specUnique-lapply(count.spec,function(x) lapply(x,unique))
 #count spec by group (2-columns)
 spec.group-lapply(count.specUnique,function(x) Reduce(function(...) 
merge(...,by=c(Seq,Mod,z),all=TRUE),x))
  #spec.group1-spec.group[lapply(spec.group,length)!=0]

 #data frame with count of spec
 res- Reduce(function(...) 
merge(...,by=c(Seq,Mod,z),all=TRUE),spec.group)
 res[is.na(res)] - 0
res- as.data.frame(res,stringsAsFactors=FALSE)
 print(res)
}

 Spec(ListFacGroup,0.05)
#   Seq Mod z a2 c2 c3 t2
#1    aAATATAGPR  1-n_acPro/ 2  5  0  0  1
#2 aAAASSPVGVGQR  1-n_acPro/ 2  6  0  0  1
#3  aAGAAGGR  1-n_acPro/ 2  1  1  0  1
#4   AAALQAK 2  1  0  1  1
#5

Re: [R] new question

2013-03-28 Thread arun


Hi,
You also mentioned about separating the significant from the non-significant.

If you replace:
Chisq1test_Count-do.call(cbind,lapply(as.data.frame(combn(names(res)[4:ncol(res)],2),stringsAsFactors=FALSE),function(x)
 
{x1-data.frame(apply(cbind(res[x[1]],res[x[2]]),1,function(y){ifelse(sum(y)==0,
 NA, chisq.test(y)$p.value)}));colnames(x1)- paste0(Count_,x[1],x[2]);x1}))
res1- cbind(res,Chisq1test_Count)

with
Chisqtest_CountNew-do.call(cbind,lapply(as.data.frame(combn(names(res)[4:ncol(res)],2),stringsAsFactors=FALSE),function(x)
 
{x1-data.frame(apply(cbind(res[x[1]],res[x[2]]),1,function(y){ifelse(sum(y)==0,
 NA, chisq.test(y)$p.value)}));colnames(x1)- 
paste0(Count_,x[1],x[2]);x2-within(x1,{Flag-ifelse(x1[,1]0.05,S,NS)}); 
colnames(x2)[2]-paste0(colnames(x2)[1],_Flag);x2}))
res1- cbind(res,Chisqtest_CountNew)

in the Spec(),
head(Spec(ListFacGroup,0.05),2)
# Seq    Mod z a2 c2 c3 t2 V1.Count_a2c2
#1 aAATATAGPR 1-n_acPro/ 2  5  0  0  1    0.02534732
#2  aAAASSPVGVGQR 1-n_acPro/ 2  6  0  0  1    0.01430588
#  V1.Count_a2c2_Flag V2.Count_a2c3 V2.Count_a2c3_Flag V3.Count_a2t2
#1  S    0.02534732  S    0.10247043
#2  S    0.01430588  S    0.05878172
#  V3.Count_a2t2_Flag V4.Count_c2c3 V4.Count_c2c3_Flag V5.Count_c2t2
#1 NS    NA   NA 0.3173105
#2 NS    NA   NA 0.3173105
 # V5.Count_c2t2_Flag V6.Count_c3t2 V6.Count_c3t2_Flag
#1 NS 0.3173105 NS
#2 NS 0.3173105 NS


A.K.





- Original Message -
From: arun smartpink...@yahoo.com
To: Vera Costa veracosta...@gmail.com
Cc: R help r-help@r-project.org
Sent: Thursday, March 28, 2013 2:28 PM
Subject: Re: [R] new question

Hi,
The function outputs the unique rows and also chisq test on frequency ( by row).


Spec - function(lista,FDR_k) {

 list.new-lapply(lista,function(x) within(x,{spec- as.character(spec)}))
 split.list-split(list.new,names(lista))

 #Data needed with FDRFDR_k
 seq.mod.z-lapply(seq_along(split.list),function(i) 
lapply(split.list[[i]],function(x) 
x[x[[FDR]]FDR_k,c(Seq,Mod,z,spec)]))
 names(seq.mod.z)- names(split.list) 

 #insert colunm with the name of the folder
 folder.name-lapply(seq.mod.z,function(x) lapply(names(x),function(i) 
do.call(rbind,lapply(x[i],function(x) cbind(folder_name=i,x)
 #merge data with the same Seq, Mod and z
 library(plyr)
 library(data.table)
 merge.data- lapply(folder.name,function(x) lapply(x,function(x1) 
{x1-data.table(x1); 
x1[,spec:=paste(spec,collapse=,),by=c(Seq,Mod,z)]}))

 #colunm with number of spec
 count.spec-lapply(merge.data,function(x) lapply(x,function(x1) 
{x1$counts-sapply(x1$spec, function(x2) length(gsub(\\s, , 
unlist(strsplit(x2, ,);x3-as.data.frame(x1);names(x3)[6]- 
as.character(unique(x3$folder_name));x3[,-c(1,5)]}))
count.specUnique-lapply(count.spec,function(x) lapply(x,unique))
 #count spec by group (2-columns)
 spec.group-lapply(count.specUnique,function(x) Reduce(function(...) 
merge(...,by=c(Seq,Mod,z),all=TRUE),x))
  #spec.group1-spec.group[lapply(spec.group,length)!=0]

 #data frame with count of spec
 res- Reduce(function(...) 
merge(...,by=c(Seq,Mod,z),all=TRUE),spec.group)
 res[is.na(res)] - 0
res- as.data.frame(res,stringsAsFactors=FALSE)
 #print(res)
Chisq1test_Count-do.call(cbind,lapply(as.data.frame(combn(names(res)[4:ncol(res)],2),stringsAsFactors=FALSE),function(x)
 
{x1-data.frame(apply(cbind(res[x[1]],res[x[2]]),1,function(y){ifelse(sum(y)==0,
 NA, chisq.test(y)$p.value)}));colnames(x1)- paste0(Count_,x[1],x[2]);x1}))
#print(Chisq1test_Count)
res1- cbind(res,Chisq1test_Count)
res1
}

ListFacGroup-ReadDir(FacGroup)
Spec(ListFacGroup,0.05)
 head(Spec(ListFacGroup,0.05))
#    Seq Mod z a2 c2 c3 t2 Count_a2c2
#1    aAATATAGPR  1-n_acPro/ 2  5  0  0  1 0.02534732
#2 aAAASSPVGVGQR  1-n_acPro/ 2  6  0  0  1 0.01430588
#3  aAGAAGGR  1-n_acPro/ 2  1  1  0  1 1.
#4   AAALQAK 2  1  0  1  1 0.31731051
#5    aAGAGPEMVR  1-n_acPro/ 2  2  2  1  2 1.
#6 aEQQQFYLLLGNLLSPDNVVR 1-_Carbamoylation/ 2  1  0  0  1 0.31731051
 # Count_a2c3 Count_a2t2 Count_c2c3 Count_c2t2 Count_c3t2
#1 0.02534732 0.10247043 NA  0.3173105  0.3173105
#2 0.01430588 0.05878172 NA  0.3173105  0.3173105
#3 0.31731051 1.  0.3173105  1.000  0.3173105
#4 1. 1.  0.3173105  0.3173105  1.000
#5 0.56370286 1.  0.5637029  1.000  0.5637029
#6 0.31731051 1. NA  0.3173105  0.3173105

A.K.



From: arun smartpink...@yahoo.com
To: Vera Costa veracosta...@gmail.com 
Cc: R help r-help@r-project.org 
Sent: Thursday, March 28, 2013 10:18 AM
Subject: Re: [R] new question

Hi,
Try

Re: [R] new question

2013-03-27 Thread arun
Hi,
Regarding the first question:
ListFacGroup-  lapply(ListFacGroup,unique)
 Spec(ListFacGroup,0.05)
#
 #   Seq Mod z a2 c2 c3 t2
#1    aAATATAGPR  1-n_acPro/ 2  1  0  0  1
#2 aAAASSPVGVGQR  1-n_acPro/ 2  1  0  0  1
#3  aAGAAGGR  1-n_acPro/ 2  1  1  0  1
#4   AAALQAK 2  1  0  1  1
#5    aAGAGPEMVR  1-n_acPro/ 2  2  2  1  2
#6 aEQQQFYLLLGNLLSPDNVVR 1-_Carbamoylation/ 2  1  0  0  1
#7 aEQQQFYLLLGNLLSPDNVVR 1-_Carbamoylation/ 3  1  0  0  1
#8 aEQQQFYLLLGNLLSPDNVVR  1-n_acPro/ 2  0  1  0  0
#9 aEQQQFYLLLGNLLSPDNVVR  1-n_acPro/ 3  1  2  2  1
#10  APGTAEK 2  0  1  0  0
#11    aSAPQQLSDEELFSQLR  1-n_acPro/ 2  1  0  0  1
#12  aVGNAVPCGAR  1-n_acPro/ 2  1  1  1  1
#13    AWEEPSSGNGTAR 2  1  1  1  1
#14  aAAAELSLLEK  1-n_acPro/ 1  1  0  0  1
#15  aAAAELSLLEK  1-n_acPro/ 2  1  1  1  1
#16 EVLGLILR 2  1  1  1  1
#17  aAAAGAAAEGEAPAEMGALLLEK  1-n_acPro/ 3  1  1  1  1
#18  aAAAPGTAVGATGSGIAGLAVYR 1-_Carbamoylation/ 3  0  0  1  0
#19  aAAAPGTAVGATGSGIAGLAVYR  1-n_acPro/ 3  1  0  0  1
#20 aAAANSGSSLPLFDCPTWAGKPPPGLHLDVVK  1-n_acPro/ 3  1  0  0  1
#21  AAAkAAK 8-K_ac/ 2  0  1  0  0
#22 aAAAVGAGHGAGGPGAASSSGGAR  1-n_acPro/ 2  0  1  1  0
#23 aAAAVGAGHGAGGPGAASSSGGAR  1-n_acPro/ 3  0  0  1  0
#24 aAADGDDSLYPIAVLIDELR  1-n_acPro/ 2  0  0  1  0


2nd question: I am not sure I understand it correctly.
A.K.




 From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com 
Sent: Wednesday, March 27, 2013 7:07 PM
Subject: Re: new question
 

Hi.
With the data in attach,the results has duplicated. I think it's better only 
one rows with the information. 
The 2nd question,about t test,it's what you did,but for frequencies directly. 
You understand? Thank you very much

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] new question

2013-03-26 Thread arun


Hi,
Try this: (Used the old data folder)

Compares the spec counts of sub directory with each other.  

directory- /home/arunksa111/dados
GetFileList - function(directory,number){
 setwd(directory)
 filelist1-dir()[file.info(dir())$isdir]
    direct-dir(directory,pattern = paste(MSMS_,number,PepInfo.txt,sep=), 
full.names = FALSE, recursive = TRUE)
 direct-lapply(direct,function(x) paste(directory,/,x,sep=))
    lista-unlist(direct)
 output- list(filelist1,lista)
 return(output)
    }

 
ReadDir-function(FacGroup){
 list.new-lista[FacGroup!=0]
 read.list-lapply(list.new, function(x) read.table(x,header=TRUE, sep = \t))
 names(read.list)-file.list.names[FacGroup!=0]
 return (read.list)
}

 file.list.names-GetFileList(directory,23) [[1]]
 lista-GetFileList(directory,23) [[2]]

FacGroup-c(1,1,0,2,2,0,0)
ListFacGroup-ReadDir(FacGroup)
lst1-lapply(ListFacGroup,function(x) {x-within(x,{spec- 
as.character(spec)}); x1- x[x$FDR0.01,c(Seq,Mod,z,spec)]; 
x1$counts-sapply(x1$spec,function(x2) length(unlist(strsplit(x2,,;x1})
lst2-lapply(as.data.frame(combn(names(lst1),2),stringsAsFactors=FALSE),function(x)
 { 
x1-merge(lst1[[x[1]]],lst1[[x[2]]],by=c(Seq,Mod,z));names(x1)[c(5,7)]- 
paste0(counts_,c(x[1],x[2]));x1[2,5]-8;x2-data.frame(apply(cbind(x1[,5],x1[,7]),1,function(y)
 {chisq.test(y)$p.value})); colnames(x2)- paste0(Counts_,x[1],x[2]);x2})

In the above code:, I replaced a count datapoint to make it significant for 
the purpose of separating the significant from others ( x1[2,5]- 8). 

 lst3-lapply(lst2,function(x) 
{x1-list(x[!x[,1]0.05,],x[x[,1]0.05,]);names(x1)-rep(colnames(x),each=2);x1})
 lst3[[5]]
#$Counts_a2c3
#[1] 1 1 1 1 1

#$Counts_a2c3
#[1] 0.01963066


  A.K.





From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com 
Sent: Tuesday, March 26, 2013 4:26 PM
Subject: Re: new question (if you could help...)


But you could do the code of t test and chisq.test or not? Thank you
No dia 26 de Mar de 2013 20:16, arun smartpink...@yahoo.com escreveu:

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] new question

2013-03-21 Thread arun
Sorry, I am not sure I understand your question.  

You have 5 boxplots(if I remember correctly) on a single page and these 5 
boxplots have titles a1,a2,c1,c2,t1 (or something like that). 

Could you try:
par(mfcol(c(3,2)) and see if that helps. (not tested)
A.K.







From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com 
Sent: Thursday, March 21, 2013 11:24 AM
Subject: Re: new question


Hi.

Thank you your help and sorry only answer now. 

Ok, the boxplots is ok. But I need too by group... on par(mfrow(c(2,2))), I can 
have par(mfrow(c(3,2)), for example (or more) and have a boxplot for the group. 
You can help me to group them?

About the other function I need to do other things, but now I need to think 
what to do...

Thank you





2013/3/18 arun smartpink...@yahoo.com




 z.boxplot- function(lst){
 new.list-  lapply(lst,function(x) x[x$FDR0.01,])
print(new.list)
  par(mfrow=c(2,2))
b1-lapply(names(new.list),function(x) lapply(new.list[x],function(y) 
boxplot(FDR~z,data=y,xlab=Charge,ylab=FDR,main=x)))

}
 z.boxplot(ListFacGroup) #prints new.list

If you want to turn off that:

 z.boxplot- function(lst){
 new.list-  lapply(lst,function(x) x[x$FDR0.01,])
#print(new.list)
  par(mfrow=c(2,2))
b1-lapply(names(new.list),function(x) lapply(new.list[x],function(y) 
boxplot(FDR~z,data=y,xlab=Charge,ylab=FDR,main=x)))


}
 z.boxplot(ListFacGroup)
A.K.






From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com
Sent: Monday, March 18, 2013 1:59 PM
Subject: Re: new question



For example, if I run you code without pdf and dev.off I have what I 
want

directory- C:/Users/Vera Costa/Desktop/dados.lixo
 #modified the function
GetFileList - function(directory,number){
  setwd(directory)
  filelist1-dir()

    lista-dir(directory,pattern = paste(MSMS_,number,PepInfo.txt,sep=), 
full.names = TRUE, recursive = TRUE)
  output- list(filelist1,lista)
  return(output)
 }
file.list.names-GetFileList(directory,23)[[1]]
lista-GetFileList(directory,23)[[2]]
FacGroup-c(0,1,0,2,2,0,3)
ReadDir-function(FacGroup){
  list.new-lista[FacGroup!=0]
  read.list-lapply(list.new, function(x) read.table(x,header=TRUE, sep = 
\t))
  names(read.list)-file.list.names[FacGroup!=0]
  return (read.list)
 }
ListFacGroup-ReadDir(FacGroup)
ListFacGroup
 z.boxplot- function(lst){
 new.list-  lapply(lst,function(x) x[x$FDR0.01,])
 print(new.list)
 #pdf(VeraBP.pdf)
 par(mfrow=c(2,2))
 lapply(names(new.list),function(x) lapply(new.list[x],function(y) 
boxplot(FDR~z,data=y,xlab=Charge,ylab=FDR,main=x)))
 #dev.off()
 }
 z.boxplot(ListFacGroup)





But I have the results too (I don't need it)


[[1]]
[[1]]$a2
[[1]]$a2$stats
 [,1] [,2]
[1,] 0.00 0.00
[2,] 0.00 0.00
[3,] 0.0001355197 0.0002175095
[4,] 0.0010588777 0.0004350190
[5,] 0.0016571381 0.0004350190
[[1]]$a2$n
[1] 8 2
[[1]]$a2$conf
  [,1]  [,2]
[1,] -0.0004559846 -0.0002685062
[2,]  0.0007270240  0.0007035253
[[1]]$a2$out
[1] 0.00494012
[[1]]$a2$group
[1] 1
[[1]]$a2$names
[1] 2 3

[[2]]
[[2]]$c2
[[2]]$c2$stats
 [,1] [,2]
[1,] 0.00 0.00
[2,] 0.00 0.00
[3,] 0.0001355197 0.0002175095
[4,] 0.0010588777 0.0004350190
[5,] 0.0016571381 0.0004350190
[[2]]$c2$n
[1] 8 2
[[2]]$c2$conf
  [,1]  [,2]
[1,] -0.0004559846 -0.0002685062
[2,]  0.0007270240  0.0007035253
[[2]]$c2$out
[1] 0.00494012
[[2]]$c2$group
[1] 1
[[2]]$c2$names
[1] 2 3

[[3]]
[[3]]$c3
[[3]]$c3$stats
    [,1] [,2]
[1,] 0.0 0.00
[2,] 0.0 0.00
[3,] 0.0 0.00
[4,] 0.002226409 0.0002086594
[5,] 0.002226409 0.0004173187
[[3]]$c3$n
[1] 6 4
[[3]]$c3$conf
 [,1]  [,2]
[1,] -0.001436105 -0.0001648409
[2,]  0.001436105  0.0001648409
[[3]]$c3$out
[1] 0.00560348
[[3]]$c3$group
[1] 1
[[3]]$c3$names
[1] 2 3

[[4]]
[[4]]$t2
[[4]]$t2$stats
 [,1] [,2] [,3]
[1,]    0 0.00    0
[2,]    0 0.00    0
[3,]    0 0.0002908668    0
[4,]    0 0.0025929827    0
[5,]    0 0.005251    0
[[4]]$t2$n
[1]  1 10  5
[[4]]$t2$conf
 [,1] [,2] [,3]
[1,]    0 -0.001004691    0
[2,]    0  0.001586424    0
[[4]]$t2$out
[1] 0.0092051934 0.0007174888
[[4]]$t2$group
[1] 2 3
[[4]]$t2$names
[1] 1 2 3
  

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] new question

2013-03-21 Thread arun
Hi,
Try this:


directory- /home/arunksa111/dados
GetFileList - function(directory,number){
 setwd(directory)
 filelist1-dir()[file.info(dir())$isdir]
    direct-dir(directory,pattern = paste(MSMS_,number,PepInfo.txt,sep=), 
full.names = FALSE, recursive = TRUE)
 direct-lapply(direct,function(x) paste(directory,/,x,sep=))
    lista-unlist(direct)
 output- list(filelist1,lista)
 return(output)
    }

 file.list.names-GetFileList(directory,23)[[1]]
 lista-GetFileList(directory,23)[[2]]
FacGroup-c(0,1,0,2,2,0,3)


ReadDir-function(FacGroup){
 list.new-lista[FacGroup!=0]
 read.list-lapply(list.new, function(x) read.table(x,header=TRUE, sep = \t))
 names(read.list)-file.list.names[FacGroup!=0]
 return (read.list)
}
ListFacGroup-ReadDir(FacGroup)

z.boxplotgroup- function(lst){
lst1- lapply(lst,function(x) x[x$FDR0.01,])
library(plyr)
new.list-lapply(split(lst1,gsub(\\d+,,names(lst1))),function(x) 
join_all(lapply(x,function(x) x),type=full))
par(mfrow=c(2,2))
b1-lapply(names(new.list),function(x) lapply(new.list[x],function(y) 
boxplot(FDR~z,data=y,xlab=Charge,ylab=FDR,main=x)))
}

z.boxplotgroup(ListFacGroup)

z.boxplot- function(lst){
 new.list-  lapply(lst,function(x) x[x$FDR0.01,])
#print(new.list)
  par(mfrow=c(2,2))
b1-lapply(names(new.list),function(x) lapply(new.list[x],function(y) 
boxplot(FDR~z,data=y,xlab=Charge,ylab=FDR,main=x)))
}
 z.boxplot(ListFacGroup)


 pdf(Veraboxplot.pdf)
 z.boxplot(ListFacGroup)
 z.boxplotgroup(ListFacGroup)
Joining by: Seq, Mod, z, score, FDR, Count, E, C, pos, spec, Pro
 dev.off()
A.K.



From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com 
Sent: Thursday, March 21, 2013 11:24 AM
Subject: Re: new question


Hi.

Thank you your help and sorry only answer now. 

Ok, the boxplots is ok. But I need too by group... on par(mfrow(c(2,2))), I can 
have par(mfrow(c(3,2)), for example (or more) and have a boxplot for the group. 
You can help me to group them?

About the other function I need to do other things, but now I need to think 
what to do...

Thank you





2013/3/18 arun smartpink...@yahoo.com




 z.boxplot- function(lst){
 new.list-  lapply(lst,function(x) x[x$FDR0.01,])
print(new.list)
  par(mfrow=c(2,2))
b1-lapply(names(new.list),function(x) lapply(new.list[x],function(y) 
boxplot(FDR~z,data=y,xlab=Charge,ylab=FDR,main=x)))

}
 z.boxplot(ListFacGroup) #prints new.list

If you want to turn off that:

 z.boxplot- function(lst){
 new.list-  lapply(lst,function(x) x[x$FDR0.01,])
#print(new.list)
  par(mfrow=c(2,2))
b1-lapply(names(new.list),function(x) lapply(new.list[x],function(y) 
boxplot(FDR~z,data=y,xlab=Charge,ylab=FDR,main=x)))


}
 z.boxplot(ListFacGroup)
A.K.






From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com
Sent: Monday, March 18, 2013 1:59 PM
Subject: Re: new question



For example, if I run you code without pdf and dev.off I have what I 
want

directory- C:/Users/Vera Costa/Desktop/dados.lixo
 #modified the function
GetFileList - function(directory,number){
  setwd(directory)
  filelist1-dir()

    lista-dir(directory,pattern = paste(MSMS_,number,PepInfo.txt,sep=), 
full.names = TRUE, recursive = TRUE)
  output- list(filelist1,lista)
  return(output)
 }
file.list.names-GetFileList(directory,23)[[1]]
lista-GetFileList(directory,23)[[2]]
FacGroup-c(0,1,0,2,2,0,3)
ReadDir-function(FacGroup){
  list.new-lista[FacGroup!=0]
  read.list-lapply(list.new, function(x) read.table(x,header=TRUE, sep = 
\t))
  names(read.list)-file.list.names[FacGroup!=0]
  return (read.list)
 }
ListFacGroup-ReadDir(FacGroup)
ListFacGroup
 z.boxplot- function(lst){
 new.list-  lapply(lst,function(x) x[x$FDR0.01,])
 print(new.list)
 #pdf(VeraBP.pdf)
 par(mfrow=c(2,2))
 lapply(names(new.list),function(x) lapply(new.list[x],function(y) 
boxplot(FDR~z,data=y,xlab=Charge,ylab=FDR,main=x)))
 #dev.off()
 }
 z.boxplot(ListFacGroup)





But I have the results too (I don't need it)


[[1]]
[[1]]$a2
[[1]]$a2$stats
 [,1] [,2]
[1,] 0.00 0.00
[2,] 0.00 0.00
[3,] 0.0001355197 0.0002175095
[4,] 0.0010588777 0.0004350190
[5,] 0.0016571381 0.0004350190
[[1]]$a2$n
[1] 8 2
[[1]]$a2$conf
  [,1]  [,2]
[1,] -0.0004559846 -0.0002685062
[2,]  0.0007270240  0.0007035253
[[1]]$a2$out
[1] 0.00494012
[[1]]$a2$group
[1] 1
[[1]]$a2$names
[1] 2 3

[[2]]
[[2]]$c2
[[2]]$c2$stats
 [,1] [,2]
[1,] 0.00 0.00
[2,] 0.00 0.00
[3,] 0.0001355197 0.0002175095
[4,] 0.0010588777 0.0004350190
[5,] 0.0016571381 0.0004350190
[[2]]$c2$n
[1] 8 2
[[2]]$c2$conf
  [,1]  [,2]
[1,] -0.0004559846 -0.0002685062
[2,]  0.0007270240  0.0007035253
[[2]]$c2$out
[1] 0.00494012
[[2]]$c2$group
[1] 1
[[2]]$c2$names
[1] 2 3

[[3]]
[[3]]$c3
[[3]]$c3$stats
    [,1] [,2]
[1,] 0.0 0.00
[2,] 0.0 0.00
[3,] 0.0 0.00
[4,] 0.002226409 

Re: [R] new question

2013-03-19 Thread arun


I deleted the 't' subfolders from the dados folder.

If you don't have 't' folders, wouldn't it be better to use:
directory- /home/arunksa111/dados


FacGroup-c(0,1,0,2,2,0,0)
#instead of

#FacGroup-c(0,1,0,2,2,0,3)
FacGroup-c(0,1,0,2,2,0,3)
 lista[FacGroup!=0]
#[1] /home/arunksa111/dados/a2/MSMS_23PepInfo.txt
#[2] /home/arunksa111/dados/c2/MSMS_23PepInfo.txt
#[3] /home/arunksa111/dados/c3/MSMS_23PepInfo.txt
#[4] NA 

FacGroup-c(0,1,0,2,2,0,0)
 lista[FacGroup!=0]
#[1] /home/arunksa111/dados/a2/MSMS_23PepInfo.txt
#[2] /home/arunksa111/dados/c2/MSMS_23PepInfo.txt
#[3] /home/arunksa111/dados/c3/MSMS_23PepInfo.txt


Couldn't reproduce the error you mentioned.

Spec(ListFacGroup,0.05)
#   Seq Mod z a2 c2 c3
#1 aAGAAGGR  1-n_acPro/ 2  1  1  0
#2  AAAkAAK 8-K_ac/ 2  1  1  0
#3   aAGAGPEMVR  1-n_acPro/ 2  2  2  1
#4    aEQQQFYLLLGNLLSPDNVVR  1-n_acPro/ 2  1  1  0
#5    aEQQQFYLLLGNLLSPDNVVR  1-n_acPro/ 3  2  2  2
#6  APGTAEK 2  1  1  0
#7  aVGNAVPCGAR  1-n_acPro/ 2  1  1  1
___
-
A.K.





From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com 
Sent: Tuesday, March 19, 2013 10:14 AM
Subject: Re: new question


Hi again. 

you sent me a code ( e few time ago)

GetFileList - function(directory,number){
 setwd(directory)
 filelist1-dir()[file.info(dir())$isdir]
    direct-dir(directory,pattern = paste(MSMS_,number,PepInfo.txt,sep=), 
full.names = FALSE, recursive = TRUE)
 direct-lapply(direct,function(x) paste(directory,/,x,sep=))
    lista-unlist(direct)
 output- list(filelist1,lista)
 return(output)
    }

ReadDir-function(FacGroup){
 list.new-lista[FacGroup!=0]
 read.list-lapply(list.new, function(x) read.table(x,header=TRUE, sep = \t))
 names(read.list)-file.list.names[FacGroup!=0]
 return (read.list)
} 

directory-C:/Users/Vera Costa/Desktop/dados.lixo
 file.list.names-GetFileList(directory,23) [[1]]
 lista-GetFileList(directory,23) [[2]]
FacGroup-c(0,1,0,2,2,0,3)
#FacGroup-c(0,1,1,1,0,2,2,2)
ListFacGroup-ReadDir(FacGroup)
#zPValues(ListFacGroup,FacGroup) 

 Spec - function(lista,FDR_k) {

 list.new-lapply(lista,function(x) within(x,{spec- as.character(spec)}))
 split.list-split(list.new,names(lista))

 #Data needed with FDRFDR_k
 seq.mod.z-lapply(seq_along(split.list),function(i) 
lapply(split.list[[i]],function(x) 
x[x[[FDR]]FDR_k,c(Seq,Mod,z,spec)]))
 names(seq.mod.z)- names(split.list) 

 #insert colunm with the name of the folder 
 folder.name-lapply(seq.mod.z,function(x) lapply(names(x),function(i) 
do.call(rbind,lapply(x[i],function(x) cbind(folder_name=i,x)
 #merge data with the same Seq, Mod and z
 library(plyr)
 library(data.table)
 merge.data- lapply(folder.name,function(x) lapply(x,function(x1) 
{x1-data.table(x1); 
x1[,spec:=paste(spec,collapse=,),by=c(Seq,Mod,z)]}))

 #colunm with number of spec
 count.spec-lapply(merge.data,function(x) lapply(x,function(x1) 
{x1$counts-sapply(x1$spec, function(x2) length(gsub(\\s, , 
unlist(strsplit(x2, ,);x3-as.data.frame(x1);names(x3)[6]- 
as.character(unique(x3$folder_name));x3[,-c(1,5)]}))

 #count spec by group (2-columns)
 spec.group-lapply(count.spec,function(x) Reduce(function(...) 
merge(...,by=c(Seq,Mod,z),all=TRUE),x))
  #spec.group1-spec.group[lapply(spec.group,length)!=0]

 #data frame with count of spec
 res- Reduce(function(...) 
merge(...,by=c(Seq,Mod,z),all=TRUE),spec.group)
 res[is.na(res)] - 0
 print(res)
 
resNew-do.call(cbind,lapply(split(names(res)[4:ncol(res)],gsub([0-9],,names(res)[4:ncol(res)])),
 function(i) {x-if(ncol(res[i])1) rowSums(res[i]) else res[i]; 
colnames(x)-NULL;x}))
 print(head(resNew),10)
 print(vera)
 print(names(resNew))
 indx-combn(names(resNew),2)
  t.test.p.value - function(...) {
  obj-try(t.test(...), silent=TRUE)
  if (is(obj, try-error)) return(NA) else return(obj$p.value)
  }

 resPval-do.call(cbind,lapply(seq_len(ncol(indx)),function(i) 
{x-as.data.frame(apply(resNew[,indx[,i]],1,t.test.p.value)); 
colnames(x)-paste(Pvalue,paste(indx[,i],collapse=),sep=_);x}))
 print(resPval)
 resF-cbind(res,resPval)
 print(resF)

Spec(ListFacGroup,0.05) 



And it is ok for the file list that you have. But if I apply to one file list 
with only a and c (for example) I have an error Error in combn(names(resNew), 
2) : n  m How can I adapt this for all cases? 



2013/3/18 arun smartpink...@yahoo.com

I hope this solved the problem...




- Original Message -
From: arun smartpink...@yahoo.com
To: Vera Costa veracosta...@gmail.com
Cc: R help r-help@r-project.org

Sent: Monday, March 18, 2013 2:27 PM
Subject: Re: new question




 z.boxplot- function(lst){
 new.list-  

Re: [R] new question

2013-03-18 Thread arun



 z.boxplot- function(lst){
 new.list-  lapply(lst,function(x) x[x$FDR0.01,])
print(new.list)
  par(mfrow=c(2,2))
b1-lapply(names(new.list),function(x) lapply(new.list[x],function(y) 
boxplot(FDR~z,data=y,xlab=Charge,ylab=FDR,main=x)))

}
 z.boxplot(ListFacGroup) #prints new.list

If you want to turn off that:
 z.boxplot- function(lst){
 new.list-  lapply(lst,function(x) x[x$FDR0.01,])
#print(new.list)
  par(mfrow=c(2,2))
b1-lapply(names(new.list),function(x) lapply(new.list[x],function(y) 
boxplot(FDR~z,data=y,xlab=Charge,ylab=FDR,main=x)))

}
 z.boxplot(ListFacGroup)
A.K.






From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com 
Sent: Monday, March 18, 2013 1:59 PM
Subject: Re: new question


For example, if I run you code without pdf and dev.off I have what I 
want 

directory- C:/Users/Vera Costa/Desktop/dados.lixo
 #modified the function
GetFileList - function(directory,number){
  setwd(directory)
  filelist1-dir()

    lista-dir(directory,pattern = paste(MSMS_,number,PepInfo.txt,sep=), 
full.names = TRUE, recursive = TRUE)
  output- list(filelist1,lista)
  return(output)
 }
file.list.names-GetFileList(directory,23)[[1]]
lista-GetFileList(directory,23)[[2]]
FacGroup-c(0,1,0,2,2,0,3)
ReadDir-function(FacGroup){
  list.new-lista[FacGroup!=0]
  read.list-lapply(list.new, function(x) read.table(x,header=TRUE, sep = \t))
  names(read.list)-file.list.names[FacGroup!=0]
  return (read.list)
 }
ListFacGroup-ReadDir(FacGroup)
ListFacGroup
 z.boxplot- function(lst){
 new.list-  lapply(lst,function(x) x[x$FDR0.01,])
 print(new.list)
 #pdf(VeraBP.pdf)
 par(mfrow=c(2,2))
 lapply(names(new.list),function(x) lapply(new.list[x],function(y) 
boxplot(FDR~z,data=y,xlab=Charge,ylab=FDR,main=x)))
 #dev.off()
 }
 z.boxplot(ListFacGroup)





But I have the results too (I don't need it)


[[1]]
[[1]]$a2
[[1]]$a2$stats
 [,1] [,2]
[1,] 0.00 0.00
[2,] 0.00 0.00
[3,] 0.0001355197 0.0002175095
[4,] 0.0010588777 0.0004350190
[5,] 0.0016571381 0.0004350190
[[1]]$a2$n
[1] 8 2
[[1]]$a2$conf
  [,1]  [,2]
[1,] -0.0004559846 -0.0002685062
[2,]  0.0007270240  0.0007035253
[[1]]$a2$out
[1] 0.00494012
[[1]]$a2$group
[1] 1
[[1]]$a2$names
[1] 2 3

[[2]]
[[2]]$c2
[[2]]$c2$stats
 [,1] [,2]
[1,] 0.00 0.00
[2,] 0.00 0.00
[3,] 0.0001355197 0.0002175095
[4,] 0.0010588777 0.0004350190
[5,] 0.0016571381 0.0004350190
[[2]]$c2$n
[1] 8 2
[[2]]$c2$conf
  [,1]  [,2]
[1,] -0.0004559846 -0.0002685062
[2,]  0.0007270240  0.0007035253
[[2]]$c2$out
[1] 0.00494012
[[2]]$c2$group
[1] 1
[[2]]$c2$names
[1] 2 3

[[3]]
[[3]]$c3
[[3]]$c3$stats
    [,1] [,2]
[1,] 0.0 0.00
[2,] 0.0 0.00
[3,] 0.0 0.00
[4,] 0.002226409 0.0002086594
[5,] 0.002226409 0.0004173187
[[3]]$c3$n
[1] 6 4
[[3]]$c3$conf
 [,1]  [,2]
[1,] -0.001436105 -0.0001648409
[2,]  0.001436105  0.0001648409
[[3]]$c3$out
[1] 0.00560348
[[3]]$c3$group
[1] 1
[[3]]$c3$names
[1] 2 3

[[4]]
[[4]]$t2
[[4]]$t2$stats
 [,1] [,2] [,3]
[1,]    0 0.00    0
[2,]    0 0.00    0
[3,]    0 0.0002908668    0
[4,]    0 0.0025929827    0
[5,]    0 0.005251    0
[[4]]$t2$n
[1]  1 10  5
[[4]]$t2$conf
 [,1] [,2] [,3]
[1,]    0 -0.001004691    0
[2,]    0  0.001586424    0
[[4]]$t2$out
[1] 0.0092051934 0.0007174888
[[4]]$t2$group
[1] 2 3
[[4]]$t2$names
[1] 1 2 3

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] new question

2013-03-15 Thread arun
Hi,
Try this:

directory- /home/arunksa111/dados 
#modified the function
GetFileList - function(directory,number){
 setwd(directory)
 filelist1-dir()
    lista-dir(directory,pattern = paste(MSMS_,number,PepInfo.txt,sep=), 
full.names = TRUE, recursive = TRUE)
 output- list(filelist1,lista)
 return(output)
    }

 file.list.names-GetFileList(directory,23)[[1]]
 lista-GetFileList(directory,23)[[2]]
FacGroup-c(0,1,0,2,2,0,3)


ReadDir-function(FacGroup){
 list.new-lista[FacGroup!=0]
 read.list-lapply(list.new, function(x) read.table(x,header=TRUE, sep = \t))
 names(read.list)-file.list.names[FacGroup!=0]
 return (read.list)
} 
ListFacGroup-ReadDir(FacGroup)

z.boxplot- function(lst){
new.list-  lapply(lst,function(x) x[x$FDR0.01,])
pdf(VeraBP.pdf)
lapply(names(new.list),function(x) lapply(new.list[x],function(y) 
boxplot(FDR~z,data=y,xlab=Charge,ylab=FDR,main=x)))
dev.off()
}
z.boxplot(ListFacGroup)

A.K.


From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com 
Sent: Friday, March 15, 2013 2:08 PM
Subject: Re: new question


Sorry, you could give me a small new help?

Using the same data, I need a boxplot by groups.

I write he the functions I'm using. The last (z.boxplot is what I need, the 
other is ok). Thank you one more time.

GetFileList - function(directory,number){
 setwd(directory)
 filelist1-dir()[file.info(dir())$isdir]
    direct-dir(directory,pattern = paste(MSMS_,number,PepInfo.txt,sep=), 
full.names = FALSE, recursive = TRUE)
 direct-lapply(direct,function(x) paste(directory,/,x,sep=))
    lista-unlist(direct)
 output- list(filelist1,lista)
 return(output)
    }

ReadDir-function(FacGroup){
 list.new-lista[FacGroup!=0]
 read.list-lapply(list.new, function(x) read.table(x,header=TRUE, sep = \t))
 names(read.list)-file.list.names[FacGroup!=0]
 return (read.list)
} 

directory-C:/Users/Vera Costa/Desktop/dados.lixo
 file.list.names-GetFileList(directory,23) [[1]]
 lista-GetFileList(directory,23) [[2]]
FacGroup-c(0,1,0,2,2,0,3)
ListFacGroup-ReadDir(FacGroup)
#zPValues(ListFacGroup,FacGroup) 

z.boxplot - function(lista) {
#I need eliminate all data with FDR0.01
new.list-lista[FDR0.01]
#boxplots split by groups 
boxplot(FDR ~ z, data = dct1,  xlab = Charge, ylab = 
FDR,main=(paste(t,i)))
 }
z.boxplot(ListFacGroup)




2013/3/13 Vera Costa veracosta...@gmail.com

No problem!
Sorry my questions.



2013/3/13 arun smartpink...@yahoo.com

As I mentioned earlier, I don't find it useful to do anova on that kind of 
data.  Previously, I tried with chisq.test also.  It gave warnings() and then 
you responded that it is not correct.  I would suggest you to dput an example 
dataset of the specific columns  that you want to compare (possibly by row) 
and post in the R-help list.  If you get any reply, then you can implement it 
on your whole list of files.  Sorry, today, I am busy.    








From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com
Sent: Wednesday, March 13, 2013 9:43 AM

Subject: Re: new question


Ok. Thank you.
Could you help me to apply this?



2013/3/13 arun smartpink...@yahoo.com

you are comparing one datapoint to another.  It doesn't make sense.  For 
anova, you need replications to calculate df.  may be you could try 
chisq.test.








From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com
Sent: Wednesday, March 13, 2013 8:56 AM

Subject: Re: new question


I agree with you.

I write this tests because I need to compare with some test. I agree is not 
very correct, but what is bioconductor?I need to eliminate some data (rows) 
not very significant based in some statistics. What about your idea? How can 
I do this?



2013/3/13 arun smartpink...@yahoo.com

Ok.



I need a t test (it's in this function). But I need a chisq.test corrected 
and a Anova with data in attach.

What do you mean by this?

Though, I calculated the t test based on comparing a single value against 
another for each row, I don't think it makes sense statistically.  Here, 
you are estimating the mean by just one value, which then is the mean value 
and comparing it with another value.  It doesn't make much sense.  I think 
in bioconductor there are some packages which do this kind of comparison (I 
don't remember the names).  Also, I am not sure what kind of inference you 
want from chisquare test.  Also, from anova test (?using just 2 datapoints) 
(if the comparison is rowwise).





From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com
Sent: Tuesday, March 12, 2013 6:04 PM

Subject: Re: new question


Ok. It isn't the last code...
You sent me this code

directory- /home/arunksa111/data.new
#first function
filelist-function(directory,number,list1){
setwd(directory)
filelist1-dir(directory)

direct-dir(directory,pattern = paste(MSMS_,number,PepInfo.txt,sep=), 
full.names = FALSE, recursive = TRUE)


Re: [R] new question

2013-03-07 Thread arun
Hi,


directory- /home/arunksa111/dados #renamed directory to dados
filelist-function(directory,number,list1){
setwd(directory)
filelist1-dir(directory)
direct-dir(directory,pattern = paste(MSMS_,number,PepInfo.txt,sep=), 
full.names = FALSE, recursive = TRUE)
list1-lapply(direct, function(x) read.table(x,header=TRUE, sep = 
\t,stringsAsFactors=FALSE))
names(list1)-filelist1
list2- list(filelist1,list1)
return(list2)
}
foldernames1-filelist(directory,23,list1)[[1]]
foldernames1
#[1] a1 a2 c1 c2 c3 t1 t2
lista-filelist(directory,23,list1)[[2]] #lista output 

#If you look at the 
 lapply(lista,function(x) sapply(x,class)) #some spec were integer, and some 
were character
#do this
 listaNew-lapply(lista,function(x) within(x,{spec- as.character(spec)}))

FacGroup- c(c1,c3,t2)
#Second function
#f- function()

head(f(listaNew,FacGroup))
# Seq    Mod z c1 c3 t2
#1 aAATATAGPR 1-n_acPro/ 2  0  0  1
#2  aAAASSPVGVGQR 1-n_acPro/ 2  0  0  1
#3   aAGAAGGR 1-n_acPro/ 2  0  0  1
#4  aAAAGAAGGRGSGPGRR 1-n_acPro/ 2  1  0  0
#5    AAALQAK    2  0  1  1
#6 aAGAGPEMVR 1-n_acPro/ 2  0  0  2


A.K.




From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com 
Sent: Thursday, March 7, 2013 7:12 AM
Subject: Re: new question


Hi.

Sorry again a question about this, but when I run this code I have this error:

Error in `[.data.table`(x1, , `:=`(spec, paste(spec, collapse = ,)),  : 
  Type of RHS ('character') must match LHS ('integer'). To check and coerce 
would impact performance too much for the fastest cases. Either change the type 
of the target column, or coerce the RHS of := yourself (e.g. by using 1L 
instead of 1)

Could you help me to with this? How can I eliminate this?

Thank you




2013/2/28 arun smartpink...@yahoo.com

Hi,
directory- /home/arunksa111/data.new
#first function
filelist-function(directory,number,list1){
setwd(directory)
filelist1-dir(directory)

direct-dir(directory,pattern = paste(MSMS_,number,PepInfo.txt,sep=), 
full.names = FALSE, recursive = TRUE)
list1-lapply(direct, function(x) read.table(x,header=TRUE, sep = 
\t,stringsAsFactors=FALSE))
names(list1)-filelist1
list2- list(filelist1,list1)
return(list2)
}
foldernames1-filelist(directory,23,list1)[[1]]
foldernames1
#[1] a1 c1 c2 c3 t1 t2
lista-filelist(directory,23,list1)[[2]] #lista output

FacGroup- c(c1,c3,t2)

#Second function
f-function(listRes,Toselect){
res2-split(listRes,gsub([0-9],,names(listRes)))
res3-lapply(seq_along(res2),function(i) lapply(res2[[i]],function(x) 
x[x[[FDR]]0.01,c(Seq,Mod,z,spec)]))
res4-lapply(res3,function(x) x[names(x)[names(x)%in%Toselect]])
res4New- lapply(res4,function(x) lapply(names(x), function(i) 
do.call(rbind,lapply(x[i],function(x) cbind(folder_name=i,x))) ))
library(plyr)
library(data.table)
res5-lapply(res4New,function(x) lapply(x,function(x1){ x1- 
data.table(x1);x1[,spec:=paste(spec,collapse=,),by=c(Seq,Mod,z)]}))
res6- lapply(res5,function(x) lapply(x,function(x1) 
{x1$counts-sapply(x1$spec, function(x2) length(gsub(\\s, , 
unlist(strsplit(x2, ,);x3-as.data.frame(x1);names(x3)[6]- 
as.character(unique(x3$folder_name));x3[,-c(1,5)]}))
 
res7-lapply(res6,function(x) Reduce(function(...) 
merge(...,by=c(Seq,Mod,z),all=TRUE),x))
 res8-res7[lapply(res7,length)!=0]
 res9- Reduce(function(...) merge(...,by=c(Seq,Mod,z),all=TRUE),res8)
res9[is.na(res9)] - 0
return(res9)
}

f(lista,FacGroup)
 head(f(lista,FacGroup))
 #    Seq    Mod z c1 c3 t2
#1 aAATATAGPR 1-n_acPro/ 2  0  0  1
#2  aAAASSPVGVGQR 1-n_acPro/ 2  0  0  1
#3   aAGAAGGR 1-n_acPro/ 2  0  0  1
#4  aAAAGAAGGRGSGPGRR 1-n_acPro/ 2  1  0  0
#5    AAALQAK    2  0  1  1
#6 aAGAGPEMVR 1-n_acPro/ 2  0  0  2

resCounts- f(lista,FacGroup)
t.test.p.value - function(...) {
    obj-try(t.test(...), silent=TRUE)
    if (is(obj, try-error)) return(NA) else return(obj$p.value)
 }

#3rd function for p-value
fpv- function(Countdata){
resNew-do.call(cbind,lapply(split(names(Countdata)[4:ncol(Countdata)],gsub([0-9],,names(Countdata)[4:ncol(Countdata)])),
 function(i) {x-if(ncol(Countdata[i])1) rowSums(Countdata[i]) else 
Countdata[i]; colnames(x)-NULL;x}))
indx-combn(names(resNew),2)
resPval-do.call(cbind,lapply(seq_len(ncol(indx)),function(i) 
{x-as.data.frame(apply(resNew[,indx[,i]],1,t.test.p.value)); 
colnames(x)-paste(Pvalue,paste(indx[,i],collapse=),sep=_);x}))
resF-cbind(resCounts,resPval)
resF
}

fpv(resCounts)


A.K.







From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com
Sent: Thursday, February 28, 2013 11:30 AM
Subject: new question



Sorry about my question, but I need a new small thing...I need to split my 
function to read data and to do the treatment of the data.

At first I need to know the names of the files and read data, and after a 
new function with my analysis.

Re: [R] new question

2013-02-28 Thread arun
Hi,
directory- /home/arunksa111/data.new
#first function
filelist-function(directory,number,list1){
setwd(directory)
filelist1-dir(directory)
direct-dir(directory,pattern = paste(MSMS_,number,PepInfo.txt,sep=), 
full.names = FALSE, recursive = TRUE)
list1-lapply(direct, function(x) read.table(x,header=TRUE, sep = 
\t,stringsAsFactors=FALSE))
names(list1)-filelist1
list2- list(filelist1,list1)
return(list2)
}
foldernames1-filelist(directory,23,list1)[[1]]
foldernames1
#[1] a1 c1 c2 c3 t1 t2
lista-filelist(directory,23,list1)[[2]] #lista output

FacGroup- c(c1,c3,t2)

#Second function
f-function(listRes,Toselect){
res2-split(listRes,gsub([0-9],,names(listRes)))
res3-lapply(seq_along(res2),function(i) lapply(res2[[i]],function(x) 
x[x[[FDR]]0.01,c(Seq,Mod,z,spec)]))
res4-lapply(res3,function(x) x[names(x)[names(x)%in%Toselect]])
res4New- lapply(res4,function(x) lapply(names(x), function(i) 
do.call(rbind,lapply(x[i],function(x) cbind(folder_name=i,x))) ))
library(plyr) 
library(data.table) 
res5-lapply(res4New,function(x) lapply(x,function(x1){ x1- 
data.table(x1);x1[,spec:=paste(spec,collapse=,),by=c(Seq,Mod,z)]}))
res6- lapply(res5,function(x) lapply(x,function(x1) 
{x1$counts-sapply(x1$spec, function(x2) length(gsub(\\s, , 
unlist(strsplit(x2, ,);x3-as.data.frame(x1);names(x3)[6]- 
as.character(unique(x3$folder_name));x3[,-c(1,5)]}))
 
res7-lapply(res6,function(x) Reduce(function(...) 
merge(...,by=c(Seq,Mod,z),all=TRUE),x)) 
 res8-res7[lapply(res7,length)!=0] 
 res9- Reduce(function(...) merge(...,by=c(Seq,Mod,z),all=TRUE),res8) 
res9[is.na(res9)] - 0
return(res9)
}

f(lista,FacGroup)
 head(f(lista,FacGroup))
 #    Seq    Mod z c1 c3 t2
#1 aAATATAGPR 1-n_acPro/ 2  0  0  1
#2  aAAASSPVGVGQR 1-n_acPro/ 2  0  0  1
#3   aAGAAGGR 1-n_acPro/ 2  0  0  1
#4  aAAAGAAGGRGSGPGRR 1-n_acPro/ 2  1  0  0
#5    AAALQAK    2  0  1  1
#6 aAGAGPEMVR 1-n_acPro/ 2  0  0  2

resCounts- f(lista,FacGroup)
t.test.p.value - function(...) { 
    obj-try(t.test(...), silent=TRUE) 
    if (is(obj, try-error)) return(NA) else return(obj$p.value) 
 }

#3rd function for p-value
fpv- function(Countdata){
resNew-do.call(cbind,lapply(split(names(Countdata)[4:ncol(Countdata)],gsub([0-9],,names(Countdata)[4:ncol(Countdata)])),
 function(i) {x-if(ncol(Countdata[i])1) rowSums(Countdata[i]) else 
Countdata[i]; colnames(x)-NULL;x}))
indx-combn(names(resNew),2) 
resPval-do.call(cbind,lapply(seq_len(ncol(indx)),function(i) 
{x-as.data.frame(apply(resNew[,indx[,i]],1,t.test.p.value)); 
colnames(x)-paste(Pvalue,paste(indx[,i],collapse=),sep=_);x})) 
resF-cbind(resCounts,resPval)
resF
}

fpv(resCounts)


A.K.






From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com 
Sent: Thursday, February 28, 2013 11:30 AM
Subject: new question


Sorry about my question, but I need a new small thing...I need to split my 
function to read data and to do the treatment of the data.

At first I need to know the names of the files and read data, and after a new 
function with my analysis.

So, I did this

directory-C:/Users/Vera Costa/Desktop/data.new 
filelist-function(directory,number){
setwd(directory)
filelist-dir(directory)
return(filelist)
direct-dir(directory,pattern = paste(MSMS_,number,PepInfo.txt,sep=), 
full.names = FALSE, recursive = TRUE)
lista-lapply(direct, function(x) read.table(x,header=TRUE, sep = \t))
names(lista)-filelist
return(lista)
}
filelist(directory,23)


###a1 a2 c1 c2 c3 t1 t2



and after 

f-function(filelist,FacGroup){

res2-split(lista,names(lista))
 res3- lapply(res2,function(x) 
{names(x)-paste(gsub(.*_,,names(x)),1:length(x),sep=);x})
res3
#Freq FDR0.01
 res4-lapply(seq_along(res3),function(i) lapply(res3[[i]],function(x) 
x[x[[FDR]]0.01,c(Seq,Mod,z,spec)]))
 names(res4)- names(res2)
 res4
  res4New-lapply(res4,function(x) lapply(names(x),function(i) 
do.call(rbind,lapply(x[i],function(x) cbind(folder_name=i,x))) ))
 res5- lapply(res4New,function(x) if(length(x)1) tail(x,-1) else NULL)
 library(plyr)
 library(data.table)
 res6- lapply(res5,function(x) lapply(x,function(x1) {x1-data.table(x1); 
x1[,spec:=past




How can I ask lista in second function? Could you help me? 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] New question

2011-12-13 Thread Jorge I Velez
Hi,

Are you sure? That's not what I got:

 require(epicalc)
 ?logistic.display
  model0 - glm(case ~ induced + spontaneous, family=binomial, data=infert)
 logistic.display(model0)

Logistic regression predicting case

 crude OR(95%CI)  adj. OR(95%CI)P(Wald's test)
induced (cont. var.) 1.05 (0.74,1.5)  1.52 (1.02,2.27)  0.042

spontaneous (cont. var.) 2.9 (1.97,4.26)  3.31 (2.19,5.01)   0.001

 P(LR-test)
induced (cont. var.) 0.042

spontaneous (cont. var.)  0.001

Log-likelihood = -139.806
No. of observations = 248
AIC value = 285.612


My impression is that you did something else and you are not telling us the
full story.  Here is my sessionInfo():

R version 2.14.0 Patched (2011-11-12 r57642)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] US.UTF-8

attached base packages:
[1] grid  splines   stats graphics  grDevices utils datasets
 methods
[9] base

other attached packages:
 [1] ggplot2_0.8.9proto_0.3-9.2reshape_0.8.4plyr_1.6
 [5] epicalc_2.14.0.0 nnet_7.3-1   MASS_7.3-16  survival_2.36-10
 [9] maptools_0.8-10  lattice_0.20-0   foreign_0.8-47   sp_0.9-91
[13] maps_2.2-2

loaded via a namespace (and not attached):
[1] tools_2.14.0
*
*


Could you please share your sessionInfo() as well as your OS, i.e., see
http://www.r-project.org/posting-guide.html?

HTH,
Jorge.-


2011/12/13 孟欣 

 According to the example of logistic.display:
 model0 - glm(case ~ induced + spontaneous, family=binomial, data=infert)
 summary(model0)
  logistic.display(model0)

  induced: 3levels 0,1,2
  spontaneous: 3levels 0,1,2

 So if 0 is reference, we should get 2 OR for
  induced1, induced2, spontaneous1, spontaneous2

 But the acturally OR is as the following,which is not what I expected:

  crude OR(95%CI)  adj. OR(95%CI)P(Wald's test)
 P(LR-test)
 induced (cont. var.) 1.05 (0.74,1.5)  1.52 (1.02,2.27)  0.042
  0.042
 spontaneous (cont. var.) 2.9 (1.97,4.26)  3.31 (2.19,5.01)   0.001
   0.001


 Can anyone give me some suggestions?

 Many thanks!






 Hi sir:
 I follow your suggestion:

 result-glm(y ~ factor(age) + factor(gender) + CD4,family = binomial)
 logistic.display(result)

 Error in coeff[, 1] : incorrect number of dimensions






 At 2011-12-14 01:59:36,Jorge I Velez  wrote:

 Hi there,

 Try

 require(epicalc)
 logistic.display(result)

 HTH,
 Jorge

 On Tue, Dec 13, 2011 at 7:16 AM, 孟欣  wrote:

 Hi all:
 My data has 3 variables:
 age(3levels : 30y=1  30-50y=2, 50y=3)
 gender(Male=0, Female=1)
 CD4 cell count(raw lab measurement)
 y(1:death  0:alive)

 I perform logistic regression to find out the factors that influence y.

 result-glm(y ~ factor(age) + factor(gender) + CD4,family = binomial)

 From the result,I can get OR(Odds Ratio) of gender  via exp(Estimate of
 Female, since Male is regarded as reference group and has no result).But
 how can I compute the 95%CI of OR of gender?


 Thanks a lot for your help.

 My best!




[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.








[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] New question

2011-12-13 Thread Jorge I Velez
I forgot to mention (sorry for double posting) that str(infert) shows that
induced and spontaneous are not factors:

'data.frame': 248 obs. of  8 variables:
 $ education : Factor w/ 3 levels 0-5yrs,6-11yrs,..: 1 1 1 1 2 2 2
2 2 2 ...
 $ age   : num  26 42 39 34 35 36 23 32 21 28 ...
 $ parity: num  6 1 6 4 3 4 1 2 1 2 ...
 $ induced   : num  1 1 2 2 1 2 0 0 0 0 ...
 $ case  : num  1 1 1 1 1 1 1 1 1 1 ...
 $ spontaneous   : num  2 0 0 0 1 1 0 0 1 0 ...
 $ stratum   : int  1 2 3 4 5 6 7 8 9 10 ...
 $ pooled.stratum: num  3 1 4 2 32 36 6 22 5 19 ...

This explains why you did not see reference levels in those variables.

HTH,
Jorge.-



On Tue, Dec 13, 2011 at 11:48 PM, Jorge I Velez  wrote:

 Hi,

 Are you sure? That's not what I got:

  require(epicalc)
  ?logistic.display
   model0 - glm(case ~ induced + spontaneous, family=binomial,
 data=infert)
  logistic.display(model0)

 Logistic regression predicting case

  crude OR(95%CI)  adj. OR(95%CI)P(Wald's test)
 induced (cont. var.) 1.05 (0.74,1.5)  1.52 (1.02,2.27)  0.042

 spontaneous (cont. var.) 2.9 (1.97,4.26)  3.31 (2.19,5.01)   0.001

  P(LR-test)
  induced (cont. var.) 0.042

 spontaneous (cont. var.)  0.001

 Log-likelihood = -139.806
 No. of observations = 248
 AIC value = 285.612


 My impression is that you did something else and you are not telling us
 the full story.  Here is my sessionInfo():

 R version 2.14.0 Patched (2011-11-12 r57642)
 Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

 locale:
 [1] US.UTF-8

 attached base packages:
 [1] grid  splines   stats graphics  grDevices utils datasets
  methods
 [9] base

 other attached packages:
  [1] ggplot2_0.8.9proto_0.3-9.2reshape_0.8.4plyr_1.6
  [5] epicalc_2.14.0.0 nnet_7.3-1   MASS_7.3-16  survival_2.36-10
  [9] maptools_0.8-10  lattice_0.20-0   foreign_0.8-47   sp_0.9-91
 [13] maps_2.2-2

 loaded via a namespace (and not attached):
 [1] tools_2.14.0
 *
 *


 Could you please share your sessionInfo() as well as your OS, i.e., see
 http://www.r-project.org/posting-guide.html?

 HTH,
 Jorge.-


 2011/12/13 孟欣 

  According to the example of logistic.display:
 model0 - glm(case ~ induced + spontaneous, family=binomial, data=infert)
 summary(model0)
  logistic.display(model0)

  induced: 3levels 0,1,2
  spontaneous: 3levels 0,1,2

 So if 0 is reference, we should get 2 OR for
  induced1, induced2, spontaneous1, spontaneous2

 But the acturally OR is as the following,which is not what I expected:

  crude OR(95%CI)  adj. OR(95%CI)P(Wald's
 test) P(LR-test)
 induced (cont. var.) 1.05 (0.74,1.5)  1.52 (1.02,2.27)  0.042
  0.042
 spontaneous (cont. var.) 2.9 (1.97,4.26)  3.31 (2.19,5.01)   0.001
   0.001


 Can anyone give me some suggestions?

 Many thanks!






 Hi sir:
 I follow your suggestion:

 result-glm(y ~ factor(age) + factor(gender) + CD4,family = binomial)
 logistic.display(result)

 Error in coeff[, 1] : incorrect number of dimensions






 At 2011-12-14 01:59:36,Jorge I Velez  wrote:

 Hi there,

 Try

 require(epicalc)
 logistic.display(result)

 HTH,
 Jorge

 On Tue, Dec 13, 2011 at 7:16 AM, 孟欣  wrote:

 Hi all:
 My data has 3 variables:
 age(3levels : 30y=1  30-50y=2, 50y=3)
 gender(Male=0, Female=1)
 CD4 cell count(raw lab measurement)
 y(1:death  0:alive)

 I perform logistic regression to find out the factors that influence y.

 result-glm(y ~ factor(age) + factor(gender) + CD4,family = binomial)

 From the result,I can get OR(Odds Ratio) of gender  via exp(Estimate of
 Female, since Male is regarded as reference group and has no result).But
 how can I compute the 95%CI of OR of gender?


 Thanks a lot for your help.

 My best!




[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.









[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] New question

2011-12-13 Thread Jorge I Velez
Hi lm_mengxin,

If that's the case, just use as.factor():

  fit - glm(case ~ as.factor(induced) + as.factor(spontaneous),
family=binomial, data=infert)
  logistic.display(fit)

   OR lower95ci upper95ci Pr(|Z|)
as.factor(induced)1  1.585398 0.7972313  3.152769 1.69e-01
as.factor(induced)2  2.281856 0.9784062  5.321787 5.620567e-02
as.factor(spontaneous)1  3.630192 1.8855353  6.989152 1.145482e-04
as.factor(spontaneous)2 10.525317 4.045 24.926241 8.745128e-08

Also, take a look at ?factor and ?glm.

HTH,
Jorge.-


2011/12/13 孟欣 

 Yes,you are right.

 But if I wanna treat induced and spontaneous as factors, how can I
 get the corresponding OR?





 At 2011-12-14 12:54:30,Jorge I Velez  wrote:

 I forgot to mention (sorry for double posting) that str(infert) shows
 that induced and spontaneous are not factors:

 'data.frame': 248 obs. of  8 variables:
  $ education : Factor w/ 3 levels 0-5yrs,6-11yrs,..: 1 1 1 1 2 2 2
 2 2 2 ...
  $ age   : num  26 42 39 34 35 36 23 32 21 28 ...
  $ parity: num  6 1 6 4 3 4 1 2 1 2 ...
  $ induced   : num  1 1 2 2 1 2 0 0 0 0 ...
  $ case  : num  1 1 1 1 1 1 1 1 1 1 ...
  $ spontaneous   : num  2 0 0 0 1 1 0 0 1 0 ...
  $ stratum   : int  1 2 3 4 5 6 7 8 9 10 ...
  $ pooled.stratum: num  3 1 4 2 32 36 6 22 5 19 ...

 This explains why you did not see reference levels in those variables.

 HTH,
 Jorge.-



 On Tue, Dec 13, 2011 at 11:48 PM, Jorge I Velez  wrote:

 Hi,

 Are you sure? That's not what I got:

  require(epicalc)
  ?logistic.display
   model0 - glm(case ~ induced + spontaneous, family=binomial,
 data=infert)
  logistic.display(model0)

 Logistic regression predicting case

  crude OR(95%CI)  adj. OR(95%CI)P(Wald's test)
 induced (cont. var.) 1.05 (0.74,1.5)  1.52 (1.02,2.27)  0.042

 spontaneous (cont. var.) 2.9 (1.97,4.26)  3.31 (2.19,5.01)   0.001

  P(LR-test)
  induced (cont. var.) 0.042

 spontaneous (cont. var.)  0.001

 Log-likelihood = -139.806
 No. of observations = 248
 AIC value = 285.612


 My impression is that you did something else and you are not telling us
 the full story.  Here is my sessionInfo():

 R version 2.14.0 Patched (2011-11-12 r57642)
 Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

 locale:
 [1] US.UTF-8

 attached base packages:
 [1] grid  splines   stats graphics  grDevices utils datasets
  methods
 [9] base

 other attached packages:
  [1] ggplot2_0.8.9proto_0.3-9.2reshape_0.8.4plyr_1.6
  [5] epicalc_2.14.0.0 nnet_7.3-1   MASS_7.3-16  survival_2.36-10
  [9] maptools_0.8-10  lattice_0.20-0   foreign_0.8-47   sp_0.9-91
 [13] maps_2.2-2

 loaded via a namespace (and not attached):
 [1] tools_2.14.0
 *
 *


 Could you please share your sessionInfo() as well as your OS, i.e., see
 http://www.r-project.org/posting-guide.html?

 HTH,
 Jorge.-


 2011/12/13 孟欣 

  According to the example of logistic.display:
 model0 - glm(case ~ induced + spontaneous, family=binomial,
 data=infert)
 summary(model0)
  logistic.display(model0)

  induced: 3levels 0,1,2
  spontaneous: 3levels 0,1,2

 So if 0 is reference, we should get 2 OR for
  induced1, induced2, spontaneous1, spontaneous2

 But the acturally OR is as the following,which is not what I expected:

  crude OR(95%CI)  adj. OR(95%CI)P(Wald's
 test) P(LR-test)
 induced (cont. var.) 1.05 (0.74,1.5)  1.52 (1.02,2.27)  0.042
0.042
 spontaneous (cont. var.) 2.9 (1.97,4.26)  3.31 (2.19,5.01)   0.001
 0.001


 Can anyone give me some suggestions?

 Many thanks!






 Hi sir:
 I follow your suggestion:

 result-glm(y ~ factor(age) + factor(gender) + CD4,family = binomial)
 logistic.display(result)

 Error in coeff[, 1] : incorrect number of dimensions






 At 2011-12-14 01:59:36,Jorge I Velez  wrote:

 Hi there,

 Try

 require(epicalc)
 logistic.display(result)

 HTH,
 Jorge

 On Tue, Dec 13, 2011 at 7:16 AM, 孟欣  wrote:

 Hi all:
 My data has 3 variables:
 age(3levels : 30y=1  30-50y=2, 50y=3)
 gender(Male=0, Female=1)
 CD4 cell count(raw lab measurement)
 y(1:death  0:alive)

 I perform logistic regression to find out the factors that influence y.

 result-glm(y ~ factor(age) + factor(gender) + CD4,family = binomial)

 From the result,I can get OR(Odds Ratio) of gender  via exp(Estimate
 of Female, since Male is regarded as reference group and has no result).But
 how can I compute the 95%CI of OR of gender?


 Thanks a lot for your help.

 My best!




[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.












[[alternative HTML version deleted]]


Re: [R] New question

2011-12-13 Thread 孟欣
Yes,it works well.


Thanks for your help.



At 2011-12-14 13:06:14,Jorge I Velez jorgeivanve...@gmail.com wrote:

Hi lm_mengxin,


If that's the case, just use as.factor():


  fit - glm(case ~ as.factor(induced) + as.factor(spontaneous), 
 family=binomial, data=infert)
  logistic.display(fit)
 
   OR lower95ci upper95ci Pr(|Z|)
as.factor(induced)1  1.585398 0.7972313  3.152769 1.69e-01
as.factor(induced)2  2.281856 0.9784062  5.321787 5.620567e-02
as.factor(spontaneous)1  3.630192 1.8855353  6.989152 1.145482e-04
as.factor(spontaneous)2 10.525317 4.045 24.926241 8.745128e-08


Also, take a look at ?factor and ?glm.


HTH,
Jorge.-




2011/12/13 ÃÏÐÀ 

Yes,you are right.


But if I wanna treat induced and spontaneous as factors, how can I get the 
corresponding OR?








At 2011-12-14 12:54:30,Jorge I Velez  wrote:
I forgot to mention (sorry for double posting) that str(infert) shows that 
induced and spontaneous are not factors:


'data.frame':248 obs. of  8 variables:
 $ education : Factor w/ 3 levels 0-5yrs,6-11yrs,..: 1 1 1 1 2 2 2 2 2 
2 ...
 $ age   : num  26 42 39 34 35 36 23 32 21 28 ...
 $ parity: num  6 1 6 4 3 4 1 2 1 2 ...
 $ induced   : num  1 1 2 2 1 2 0 0 0 0 ...
 $ case  : num  1 1 1 1 1 1 1 1 1 1 ...
 $ spontaneous   : num  2 0 0 0 1 1 0 0 1 0 ...
 $ stratum   : int  1 2 3 4 5 6 7 8 9 10 ...
 $ pooled.stratum: num  3 1 4 2 32 36 6 22 5 19 ...


This explains why you did not see reference levels in those variables.


HTH,
Jorge.-




On Tue, Dec 13, 2011 at 11:48 PM, Jorge I Velez  wrote:
Hi, 


Are you sure? That's not what I got:


 require(epicalc)
 ?logistic.display
  model0 - glm(case ~ induced + spontaneous, family=binomial, data=infert)
 logistic.display(model0)


Logistic regression predicting case 
 
 crude OR(95%CI)  adj. OR(95%CI)P(Wald's test)
induced (cont. var.) 1.05 (0.74,1.5)  1.52 (1.02,2.27)  0.042 
  
spontaneous (cont. var.) 2.9 (1.97,4.26)  3.31 (2.19,5.01)   0.001   
  
 P(LR-test)
induced (cont. var.) 0.042 
   
spontaneous (cont. var.)  0.001   
   
Log-likelihood = -139.806
No. of observations = 248
AIC value = 285.612




My impression is that you did something else and you are not telling us the 
full story.  Here is my sessionInfo():


R version 2.14.0 Patched (2011-11-12 r57642)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)


locale:
[1] US.UTF-8


attached base packages:
[1] grid  splines   stats graphics  grDevices utils datasets  
methods  
[9] base 


other attached packages:
 [1] ggplot2_0.8.9proto_0.3-9.2reshape_0.8.4plyr_1.6
 [5] epicalc_2.14.0.0 nnet_7.3-1   MASS_7.3-16  survival_2.36-10
 [9] maptools_0.8-10  lattice_0.20-0   foreign_0.8-47   sp_0.9-91   
[13] maps_2.2-2  


loaded via a namespace (and not attached):
[1] tools_2.14.0




Could you please share your sessionInfo() as well as your OS, i.e., see 
http://www.r-project.org/posting-guide.html?


HTH,
Jorge.-



2011/12/13 ÃÏÐÀ 

According to the example of logistic.display:
model0 - glm(case ~ induced + spontaneous, family=binomial, data=infert) 
summary(model0)
 logistic.display(model0)


 induced: 3levels 0,1,2
 spontaneous: 3levels 0,1,2


So if 0 is reference, we should get 2 OR for  induced1, induced2, 
spontaneous1, spontaneous2


But the acturally OR is as the following,which is not what I expected:


 crude OR(95%CI)  adj. OR(95%CI)P(Wald's test) 
P(LR-test)
induced (cont. var.) 1.05 (0.74,1.5)  1.52 (1.02,2.27)  0.042  
0.042  
spontaneous (cont. var.) 2.9 (1.97,4.26)  3.31 (2.19,5.01)   0.001 
0.001   




Can anyone give me some suggestions?


Many thanks!











Hi sir:
I follow your suggestion:


result-glm(y ~ factor(age) + factor(gender) + CD4,family = binomial)
logistic.display(result)


Error in coeff[, 1] : incorrect number of dimensions











At 2011-12-14 01:59:36,Jorge I Velez  wrote:
Hi there,


Try


require(epicalc)
logistic.display(result)


HTH,
Jorge


On Tue, Dec 13, 2011 at 7:16 AM, ÃÏÐÀ  wrote:
Hi all:
My data has 3 variables:
age(3levels : 30y=1  30-50y=2, 50y=3)
gender(Male=0, Female=1)
CD4 cell count(raw lab measurement)
y(1:death  0:alive)

I perform logistic regression to find out the factors that influence y.

result-glm(y ~ factor(age) + factor(gender) + CD4,family = binomial)

From the result,I can get OR(Odds Ratio) of gender  via exp(Estimate of 
Female, since Male is regarded as reference group and has no result).But how 
can I compute the 95%CI of OR of gender?


Thanks a lot for your help.

My best!




   [[alternative HTML version deleted]]