Sake wrote:
>
> Hi,
>
> I'm having difficulties with a dataset containing gene names and the
> positions of those genes.
> That is not such a big problem in itself, but each gene has multiple exons,
> so it's hard to say where the gene starts and where it ends. I want the
> starting and ending position of each gene in my dataset.
> Attached is the dataset:
> http://www.nabble.com/file/p21312449/genlistchrompos.csv
> genlistchrompos.csv
> Column 'B' is the gene name, 'G' is the starting position and 'H' is the
> stop position.
> You can load the dataset by using:
> data <- read.csv("genlistchrompos.csv", sep=";")
> I hope someone can help me, it's been giving me headaches for a week now :-((.
>
> Thanks!
>

Thanks for the tips, I'm going to test them today!

The B, G and H columns I mentioned are the columns you see when you open the file in Excel; I should have said that. Sorry for the confusion :-)

I thought I had to use an 'if' statement because I only want to search for the min and max when the gene name is the same as the one directly under it. The 'for' loop was meant to apply that 'if' statement to the entire column of gene names.

Edit: I have tested:

    aggregate(data[, c("Exon_Start.Chr.")], by = list(data$Gene), min)
    aggregate(data[, c("Exon_Stop.Chr.")], by = list(data$Gene), max)

And it worked like a charm! Thanks!
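
For anyone finding this thread later, here is a minimal sketch of how the two aggregate() calls above could be combined into one table with a start and a stop per gene. The column names Gene, Exon_Start.Chr. and Exon_Stop.Chr. are taken from the calls above; the toy data, the object names exons and gene_ranges, and the merge() step are assumptions added for illustration and are not part of the original solution.

```r
# Toy stand-in for genlistchrompos.csv; gene names and positions are made up.
exons <- data.frame(
  Gene            = c("geneA", "geneA", "geneA", "geneB", "geneB"),
  Exon_Start.Chr. = c(100, 250, 400, 900, 1200),
  Exon_Stop.Chr.  = c(180, 320, 520, 1000, 1350)
)

# Smallest exon start per gene. Passing a one-column data frame (single
# brackets, no comma) and a named 'by' list keeps readable column names.
starts <- aggregate(exons["Exon_Start.Chr."],
                    by = list(Gene = exons$Gene), FUN = min)

# Largest exon stop per gene.
stops <- aggregate(exons["Exon_Stop.Chr."],
                   by = list(Gene = exons$Gene), FUN = max)

# One row per gene with both positions side by side.
gene_ranges <- merge(starts, stops, by = "Gene")
print(gene_ranges)
```

Because aggregate() groups by gene name, no 'for' loop or 'if' test against the next row is needed, and the rows for one gene do not have to be adjacent in the file.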