Re: [R] a workaround for indexing a function?
I noticed a missing line in the code associated with my question. I've re-supplied the question now with that line. Sorry for the trouble, Adam I have a dataset of peptide 15N/14N ratios from four experiments (where cells are grown under different conditions and were fed either 15N or 14N nitrate aka plant food if you are interested) and I want to normalize each of these ratios to the median 15N/14N value from all 6000 peptides from each experiment. Then, since these ratios (from 2 or more peptides) are used to calculate the protein 15N/14N abundance, I want to take the median value of all peptides belonging to one protein to calculate the overall protein 15N/14N abundance within that experiment. Iâve managed to break the data set into small pieces (by protein and by experiment), which will lead to about 500 observations. A dummy â small â dataset could look like this (below), where V1 lists the peptides that make up a protein, V2 is the protein ID, V3 is the metric of interest (the 15N/14N ratio of each peptide), and V4 indicates which experiment was run. The stumbling block here is that I cannot index the median function. Here is what I have done so far. y-read.table(ânameoffile.txtâ) h-split(y,y$V4) for(n in names(h)) write.table(h[[n]], file=paste(y[[n]][,âV4â][4],â_â,n,â.TXTâ,sep=ââ)) Then I (not elegantly but OK with brute force method for four files) went into each file and calculated the median vls1-read.table(âVLS5584_VLS5584.TXTâ) # for example, but did this four times for each file newcol=vls1[,3]/median(vls1[,3]) #make a new column of 15N/14N data now normalized to median value⦠here this is shown for one of four experiments. normvls1=cbind(vls1,newcol) #added to existing data with global median normalized numbers. nvls1prot-split(normvls1,normvls1$V2) #splits up this experiment into many files.. one for each protein. for(n in names(nvls1prot))write.table(nvls1prot[[n]],file=paste(nvls1prot[[n]][,V4][1],_, n,.TXT,sep=)) #writes these files Then I took this output in windows explorer and moved it to a new folder. Set working directory to that folder. And, lifted from a help page and tweaked to my liking.. # batch import text files (files must be in working directory); 'pattern' is case-sensitive txtfiles = list.files(pattern=*.TXT) for (i in 1:length(txtfiles)){ tmp = read.table(txtfiles[i]) xyz = data.frame(tmp) normHL = xyz[[5]] median[i] = median(normHL) } But after that, I get the following error message. Error in median[i] = median(normHL) : object of type 'closure' is not subsettable Thanks, Adam Kustka SEAEFVNYQVR B5YLU3 0.226 VLS5584 LLQTEPGTR B5YLU3 0.199 VLS5584 SEAEFVNYQVR B5YLU3 0.216 VLS5585 LLQTEPGTR B5YLU3 0.183 VLS5585 SEAEFVNYQVR B5YLU3 0.266 VLS5586 FAQMAVLGFIIPEK B5YLU3 .2 VLS5586 LLQTEPGTR B5YLU3 0.203 VLS5586 SEAEFVNYQVR B5YLU3 0.516 VLS5587 FAQMAVLGFIIPEK B5YLU3 0.764 VLS5587 IEGLGWRPK B5YLU3 .2 VLS5587 LLQTEPGTR B5YLU3 0.338 VLS5587 LLQTEPGTR B5YLU3 .2 VLS5587 FSCAYLVDNPR B5YLV4 0.131 VLS5584 ITCQDPSDDFPTYR B5YLV4 0.122 VLS5584 SPLPEDLMPEFSFR B5YLV4 0.105 VLS5584 ITCQDPSDDFPTYR B5YLV4 .2 VLS5585 FLEVPMYLK B5YLV4 .2 VLS5585 ITCQDPSDDFPTYR B5YLV4 0.113 VLS5585 ITCQDPSDDFPTYR B5YLV4 0.112 VLS5586 FSCAYLVDNPR B5YLV4 0.128 VLS5586 SPLPEDLMPEFSFR B5YLV4 0.126 VLS5586 FLEVPMYLKCLDR B5YLV4 0.406 VLS5586 LYYNVETLK B5YLV4 0.11 VLS5586 GEDGYLTTK B5YLV4 0.144 VLS5586 FLEVPMYLK B5YLV4 0.134 VLS5586 ITCQDPSDDFPTYR B5YLV4 .2 VLS5586 FSCAYLVDNPR B5YLV4 .2 VLS5586 FKDDFFLK B5YLV4 0.121 VLS5586 ITCQDPSDDFPTYRDLEK B5YLV4 0.138 VLS5586 FLEVPMYLK B5YLV4 0.123 VLS5586 CFDDAFVR B5YLV4 0.15 VLS5586 VFFTHGMYYTGGNLVAQVK B5YLV4 0.25 VLS5586 FKDDFFLK B5YLV4 0.141 VLS5586 CGFDEVHAR B5YLV4 0.108 VLS5586 YPENYNIEELPAAGQSYIHPDTYVQR B5YLV4 0.153 VLS5587 SPLPEDLMPEFSFR B5YLV4 0.153 VLS5587 FSCAYLVDNPR B5YLV4 0.137 VLS5587 LYYNVETLK B5YLV4 0.157 VLS5587 CGFDEVHAR B5YLV4 0.145 VLS5587 FLEVPMYLK B5YLV4 0.142 VLS5587 On Wed, Jun 18, 2014 at 9:12 AM, Adam Kustka kus...@andromeda.rutgers.edu wrote: I have a dataset of peptide 15N/14N ratios from four experiments (where cells are grown under different conditions and were fed either 15N or 14N nitrate aka plant food if you are interested) and I want to normalize each of these ratios to the median 15N/14N value from all 6000 peptides from each experiment. Then, since these ratios (from 2 or more peptides) are used to calculate the protein 15N/14N abundance, I want to take the median value of all peptides belonging to one protein to calculate the overall protein 15N/14N abundance within that experiment. Iâve managed to break the data set into small pieces (by protein and by experiment), which will lead to about 500 observations. A dummy â
[R] a workaround for indexing a function?
I have a dataset of peptide 15N/14N ratios from four experiments (where cells are grown under different conditions and were fed either 15N or 14N nitrate aka plant food if you are interested) and I want to normalize each of these ratios to the median 15N/14N value from all 6000 peptides from each experiment. Then, since these ratios (from 2 or more peptides) are used to calculate the protein 15N/14N abundance, I want to take the median value of all peptides belonging to one protein to calculate the overall protein 15N/14N abundance within that experiment. Iâve managed to break the data set into small pieces (by protein and by experiment), which will lead to about 500 observations. A dummy â small â dataset could look like this (below), where V1 lists the peptides that make up a protein, V2 is the protein ID, V3 is the metric of interest (the 15N/14N ratio of each peptide), and V4 indicates which experiment was run. The stumbling block here is that I cannot index the median function. Here is what I have done so far. y-read.table(ânameoffile.txtâ) h-split(y,y$V4) for(n in names(h)) write.table(h[[n]], file=paste(y[[n]][,âV4â][4],â_â,n,â.TXTâ,sep=ââ)) Then I (not elegantly but OK with brute force method for four files) went into each file and calculated the median vls1-read.table(âVLS5584_VLS5584.TXTâ) # for example, but did this four times for each file newcol=vls1[,3]/median(vls1[,3]) normvls1=cbind(vls1,newcol) nvls1prot-split(normvls1,normvls1$V2) for(n in names(nvls1prot))write.table(nvls1prot[[n]],file=paste(nvls1prot[[n]][,V4][1],_, n,.TXT,sep=)) Then I took this output in windows explorer and moved it to a new folder. Set working directory to that folder. And, lifted from a help page and tweaked to my liking.. for (i in 1:length(txtfiles)){ tmp = read.table(txtfiles[i]) xyz = data.frame(tmp) normHL = xyz[[5]] median[i] = median(normHL) } But after that, I get the following error message. Error in median[i] = median(normHL) : object of type 'closure' is not subsettable Thanks, Adam Kustka SEAEFVNYQVR B5YLU3 0.226 VLS5584 LLQTEPGTR B5YLU3 0.199 VLS5584 SEAEFVNYQVR B5YLU3 0.216 VLS5585 LLQTEPGTR B5YLU3 0.183 VLS5585 SEAEFVNYQVR B5YLU3 0.266 VLS5586 FAQMAVLGFIIPEK B5YLU3 .2 VLS5586 LLQTEPGTR B5YLU3 0.203 VLS5586 SEAEFVNYQVR B5YLU3 0.516 VLS5587 FAQMAVLGFIIPEK B5YLU3 0.764 VLS5587 IEGLGWRPK B5YLU3 .2 VLS5587 LLQTEPGTR B5YLU3 0.338 VLS5587 LLQTEPGTR B5YLU3 .2 VLS5587 FSCAYLVDNPR B5YLV4 0.131 VLS5584 ITCQDPSDDFPTYR B5YLV4 0.122 VLS5584 SPLPEDLMPEFSFR B5YLV4 0.105 VLS5584 ITCQDPSDDFPTYR B5YLV4 .2 VLS5585 FLEVPMYLK B5YLV4 .2 VLS5585 ITCQDPSDDFPTYR B5YLV4 0.113 VLS5585 ITCQDPSDDFPTYR B5YLV4 0.112 VLS5586 FSCAYLVDNPR B5YLV4 0.128 VLS5586 SPLPEDLMPEFSFR B5YLV4 0.126 VLS5586 FLEVPMYLKCLDR B5YLV4 0.406 VLS5586 LYYNVETLK B5YLV4 0.11 VLS5586 GEDGYLTTK B5YLV4 0.144 VLS5586 FLEVPMYLK B5YLV4 0.134 VLS5586 ITCQDPSDDFPTYR B5YLV4 .2 VLS5586 FSCAYLVDNPR B5YLV4 .2 VLS5586 FKDDFFLK B5YLV4 0.121 VLS5586 ITCQDPSDDFPTYRDLEK B5YLV4 0.138 VLS5586 FLEVPMYLK B5YLV4 0.123 VLS5586 CFDDAFVR B5YLV4 0.15 VLS5586 VFFTHGMYYTGGNLVAQVK B5YLV4 0.25 VLS5586 FKDDFFLK B5YLV4 0.141 VLS5586 CGFDEVHAR B5YLV4 0.108 VLS5586 YPENYNIEELPAAGQSYIHPDTYVQR B5YLV4 0.153 VLS5587 SPLPEDLMPEFSFR B5YLV4 0.153 VLS5587 FSCAYLVDNPR B5YLV4 0.137 VLS5587 LYYNVETLK B5YLV4 0.157 VLS5587 CGFDEVHAR B5YLV4 0.145 VLS5587 FLEVPMYLK B5YLV4 0.142 VLS5587 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] adonis output not significant even with dummy input?
I used adonis to test tested for differences in the relative species (OTU) abundances from incubation experiments with a common starting water source and duplicates for each of three treatments (a one way test with three levels in the independent variable). For full disclosure (but it may not be relevant) I came up with a metric to reflect change in rel species abundance from t0 to t9 days, weighted to not overly inflate significance of rare species. The Bray Curtis similarity table looked compelling (where A and B represent replicates). I was surprised to see p ~0.06. So, I made a dummy file with a blatant differences (dummy file below) and only obtained a p value of 0.05. Iâm clearly missing something here. BTW I also noticed if you purposefully mismatch your env file you can still get a âresultâ. I'm so grateful for your help!!! B-C output from real data (three treatments each replicated twice) St24FeA St24FeB St24NoFeA St24NoFeB St24EntA St24FeB 0.081774 St24NoFeA 0.322509 0.282081 St24NoFeB 0.343357 0.301258 0.064281 St24EntA 0.268338 0.221075 0.098628 0.161846 St24EntB 0.203852 0.194518 0.166727 0.221682 0.096111 Dummy file used as OTU1 OTU2 OTU3 OTU5 St24FeA 100 20 100 20 St24FeB 100 25 100 25 St24NoFeA 5 80 5 80 St24NoFeB 5 80 5 80 St24EntA 7 100 7 100 St24EntB 7 100 7 100 BC of dummy file (BCdum) is even more compelling than the real data (no transforms done for this). Output from adonis using BCdum as input. Call: adonis(formula = Bcdum ~ treatment data=st24.env, permutation=2000) Terms added sequentially (first to last) Df SumsOfSqs MeanSqs F.Model R2 Pr(F) treatment 2 0.73764 0.36882 5313.2 0.99972 0.05147 Residuals 3 0.00021 0.7 0.00028 Total 5 0.73785 1 The entire script is here. Note the âabundanceâ data is not transformed in R. Rather, I calculate a metric of change in percent abundance t= 9 d(counts(i)/counts(total) / t = 0 d(counts(i)/counts(total) which is then normalized so small abs changes in rare species donât have an undue influence. St24Y-read.table(st24relchangeforR.txt) BC24-vegdist(St24Y,method=âbrayâ,binary=FALSE, diag=FALSE, upper=FALSE,na.rm = FALSE) st24.env-read.table(st24envfileforR.txt) adonis(BC24~ treatment,st24.env,perm=2000) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.