Hi,

I am still working on my large dataset (sample attached), which contains a series of binary yes/no flag variables for affected brain areas ("Lica", "LAtChor", "LA1", etc.).
I need to scan these columns and set "LesionSide" for each record: "L" if any "Lxxx" column is Y, "R" if any "Rxxx" column is Y, and "B" if both. There are more than 2,500 records, so for-loops would be inefficient. Any suggestions? Much obliged.

Best,
Jon
Soli Deo Gloria
SubjID LesionSide Rica RAtChor RA1 RA2 RM1 RM2 RPcom RP1 RP2 RBas RSCb RAica RPica RPreFrot RFrot RPar ROcc RLatTemp RMesTemp RBG RThal RMidbrai RPos RCb RMedulla Lica LAtChor LA1 LA2 Acom LM1 LM2 LPcom LP1 LP2 LBas LVert RVert LSCb LAica LPica LPreFrot LFrot LPar LOcc LLatTemp LMesTemp LBG LThal LMidbrai LPos LCb LMedulla Single WhiteMatterDisease WhiteMatterSeverity delim
105 R Y 1
106 R Y Y Y 1
107 U 1
108 B Y Y 1
108 B Y Y 1
109 L Y Y 1
110 B 1
111 U 1
112 U 1
113 B Y Y Y 1
114 U 1
115 U 1
116 R 1
117 U 1
118 R 1
119 L Y 1
120 R Y Y 1
120 R Y Y 1
120 R 1
121 U Y Y 1
122 U Y Y 1
123 U 1
124 U 1
126 L 1
127 U 1
128 L 1
128 L 1
129 U 1
130 L Y 1
131 R Y 1
132 R 1
132 B 1
133 B Y Y 1
134 R 1
135 L 1
136 B 1
137 B Y Y Y Y Y Y 1
137 B 1
137 B 1
137 B 1
138 B Y Y Y Y Y Y 1
138 B Y Y Y Y Y Y 1
139 U Y Y Y Y Y Y 1
139 U Y Y Y Y 1
140 R Y Y Y Y 1
140 R 1
141 U 1
141 U 1
142 U 1
142 U Y Y Y Y Y Y Y Y 1
143 R Y Y Y Y Y Y Y Y 1
144 U 1
145 L 1
146 U Y Y Y Y Y Y 1
146 U Y 1
147 L Y Y 1
148 L 1
149 U 1
150 R Y 1
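
For what it's worth, the recoding described above can be done without an explicit loop. The following is only a minimal sketch, not something tested on the full dataset. It assumes the data sit in a data frame called `dat` (a hypothetical name, filled here with made-up toy rows), that flags are stored as "Y" with blanks or NA otherwise, that side is identified purely by the "L"/"R" prefix of the column name, and that records with no flags at all should get "U" as in the sample.

```r
# Hypothetical toy data; `dat` and its values are illustrative only.
dat <- data.frame(
  SubjID     = c(105, 106, 107, 108),
  LesionSide = NA_character_,
  Rica       = c("Y", "",  "", "Y"),
  RA1        = c("",  "",  "", ""),
  Lica       = c("",  "Y", "", ""),
  LA1        = c("",  "",  "", "Y"),
  stringsAsFactors = FALSE
)

# Pick flag columns by name prefix, excluding LesionSide itself.
left_cols  <- setdiff(grep("^L", names(dat), value = TRUE), "LesionSide")
right_cols <- grep("^R", names(dat), value = TRUE)

# TRUE for rows with at least one "Y" among the given columns.
# NA cells are treated as "not flagged".
any_yes <- function(cols) {
  unname(rowSums(dat[, cols, drop = FALSE] == "Y", na.rm = TRUE) > 0)
}

has_left  <- any_yes(left_cols)
has_right <- any_yes(right_cols)

# Assumption: rows with no flags on either side are coded "U".
dat$LesionSide <- ifelse(has_left & has_right, "B",
                  ifelse(has_left,  "L",
                  ifelse(has_right, "R", "U")))
```

Because `rowSums()` operates on the whole logical matrix at once, this should handle a few thousand records without trouble. Columns such as "Acom" that start with neither prefix are simply ignored.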
Jon Erik Ween, MD, MS
Scientist, Kunin-Lunenfeld Applied Research Unit
Director, Stroke Clinic, Brain Health Clinic, Baycrest Centre
Assistant Professor, Dept. of Medicine, Div. of Neurology
University of Toronto Faculty of Medicine

Kimel Family Building, 6th Floor, Room 644
Baycrest Centre
3560 Bathurst Street
Toronto, Ontario M6A 2E1
Canada

Phone: 416-785-2500 x3648
Fax: 416-785-2484
Email: jw...@klaru-baycrest.on.ca

On 2010-12-09, at 2:21 PM, David Winsemius wrote:

>
> On Dec 9, 2010, at 12:44 PM, Sashi Challa wrote:
>
>> Hello All,
>>
>> I have a toy dataframe like this. It has 8 columns separated by tab.
>>
>> Name SampleID Al1 Al2 X Y R Th
>> rs191191 A1 A B 0.999 0.09 0.78 0.090
>> abc928291 A1 B J 0.3838 0.3839 0.028 0.888
>> abcnab A1 H K 0.3939 0.939 0.3939 0.77
>> rx82922 B1 J K 0.3838 0.393 0.393 0.00
>> rcn3939 B1 M O 0.000 0.000 0.000 0.77
>> tcn39399 B1 P I 0.393 0.393 0.393 0.56
>>
>> Note that the SampleID is repeating. So I want to be able to split the
>> dataset based on the SampleID and write the splitted dataset of every
>> SampleID into a new file.
>> I tried split followed by lapply to do this.
>>
>> infile <- read.csv("test.txt", sep="\t", as.is = TRUE, header = TRUE)
>> infile.split <- split(infile, infile$SampleID)
>> names(infile.split[1]) ## outputs "A1"
>> ## now A1, B1 are two lists in infile.split as I understand it. Correct me
>> if I am wrong.
>>
>> lapply(infile.split,function(x){
>> filename <- names(x) #### here I expect to see A1 or B1, I
>> didn't, I tried (names(x)[1]) and that gave me "Name" and not A1 or B1.
>> final_filename <- paste(filename,"toy_set.txt",sep="_")
>> write.table(x, file = paste(path, final_filename,sep="/"),
>> row.names=FALSE, quote=FALSE,sep="\t")
>> } )
>>
>> In lapply I wanted to give a unique filename to all the split Sample Ids,
>> i.e. name them here as <dragged to the c() construct>.
>> How do I get those names, i.e. A1, B1 to create a filename like above.
>
> names(infile.split) <- c("A1_toy_set.txt", "B1_toy_set.txt")
>
>> When I write each of the elements in the list obtained after split into a
>> file,
>
> How are you proposing to do this "writing"?
>
>> the column names would have names like A1.Name, A1.SampleID, A1.Al1, …..
>
> Are you sure? Why would you think that?
>
> --
> David.
>
>> Can I get rid of "A1" in the column names within the lapply (other than
>> reading in the file again and changing the names) ?
>>
>> Thanks for your time,
>>
>> Regards
>> Sashi
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> West Hartford, CT
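
On the quoted question about file names: inside `lapply(infile.split, function(x) ...)` each `x` is a data frame, so `names(x)` returns its column names rather than "A1" or "B1". One possible way around this, shown here only as a sketch, is to iterate over `names(infile.split)` instead. The toy rows and the use of `tempdir()` for `path` are assumptions for illustration; substitute the real data and output directory.

```r
# Hypothetical toy data loosely mirroring the quoted example.
infile <- data.frame(
  Name     = c("rs191191", "abc928291", "rx82922"),
  SampleID = c("A1", "A1", "B1"),
  X        = c(0.999, 0.3838, 0.3838),
  stringsAsFactors = FALSE
)

infile.split <- split(infile, infile$SampleID)

# Assumption: write into a temporary directory for the demonstration.
path <- tempdir()

# Loop over the list names so each sample ID is available for the file name.
written <- vapply(names(infile.split), function(id) {
  out <- file.path(path, paste(id, "toy_set.txt", sep = "_"))
  write.table(infile.split[[id]], file = out,
              row.names = FALSE, quote = FALSE, sep = "\t")
  out  # return the path that was written
}, character(1))
```

The column names in each written file stay as in the original data frame, since `split()` does not rename the columns of its pieces.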