Hi, I am attempting to use plsr(), which is part of the pls package in R. I am conducting analysis on datasets to identify which proteins/peptides are responsible for the variance between sample groups (biomarker spotting) in a multivariate fashion.
I have a dataset in R called "FullDataListTrans". As you can see below, the data has 40 rows, each representing a sample, and 94,727 columns, each representing a peptide.

    > str(FullDataListTrans)
     num [1:40, 1:94727] 42 40.9 65 56 61.7 ...
     - attr(*, "dimnames")=List of 2
      ..$ : chr [1:40] "X" "X.1" "X.12" "X.13" ...
      ..$ : NULL

I have also created a vector "GroupingList" which gives the group name for each respective sample (row).

    > GroupingList
     [1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4
    [39] 4 4
    > str(GroupingList)
     int [1:40] 1 1 1 1 1 1 1 1 1 1 ...

I am now stuck while conducting the plsr. I have tried various methods of creating structured lists etc. and have got nowhere. I have also tried many incarnations of

    BHPLS1 <- plsr(GroupingList ~ PCIList, ncomp = FeaturePresenceExpected[1], data = FullDataListTrans, validation = "LOO")

Where am I going wrong? Also, what is the easiest method to identify which of the ~94,000 peptides are most important to the variance between groups?

Thanks in advance for any help
Amit Patel
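
For reference, here is a minimal sketch of how a call of this kind is usually structured with the pls package. It is not a confirmed fix for the call above. The simulated matrix, the names `X`, `pls_data`, `Group` and `Peptides`, and the choice of 5 components are all assumptions made for illustration. The main idea is that plsr() looks up the formula variables in `data`, so the predictor matrix is wrapped in I() and stored as a single column of a data frame next to the response. Treating the group codes 1 to 4 as a numeric response implies an ordering of the groups; if the groups are unordered, a dummy indicator matrix as response may be more appropriate.

```r
library(pls)

# Hypothetical stand-in data: 40 samples x 200 peptides
# (the real matrix in the post has far more columns).
set.seed(1)
X <- matrix(rnorm(40 * 200, mean = 50, sd = 10), nrow = 40)
GroupingList <- rep(1:4, each = 10)

# Keep the whole predictor matrix as ONE column of the data frame via I(),
# so the formula can refer to it by name.
pls_data <- data.frame(Group = GroupingList, Peptides = I(X))

# Fit with leave-one-out cross-validation; ncomp = 5 is an arbitrary choice.
fit <- plsr(Group ~ Peptides, ncomp = 5, data = pls_data, validation = "LOO")

# One simple way to rank peptides: absolute regression coefficient
# at the chosen number of components.
b <- drop(coef(fit, ncomp = 5))
top <- order(abs(b), decreasing = TRUE)[1:10]  # column indices of the 10 largest
```

Here `top` holds the column indices of the peptides with the largest absolute coefficients. Ranking by coefficient is only one possible importance measure, and the cross-validated error (for example via RMSEP(fit)) can help choose the number of components before interpreting the coefficients.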