Hi

> Honestly, thank you for the prompt response.
> You are right: I will tell you what I want to do
> and not the way, since I don't know much R.
>
> I have a txt file with proteins:
>
> "Prot_10035" "Func_0005874" "Func_0016787" "Func_0003774" "Func_0006898"
> "Func_0005856" "Func_0005525" "Func_0005737" "Func_0003924" "Func_0005515"
> "Func_0000166"
> "Prot_10036" "Func_0005739" "Func_0003735" "Func_0006412" "Func_0005763"
> "Func_0005840"
> "Prot_10037" "Func_0005739" "Func_0005515"
> "Prot_10039" "Func_0005576" "Func_0009615" "Func_0050832" "Func_0005615"
> "Func_0006955" "Func_0042742" "Func_0031640" "Func_0006935"
> "Prot_1004" "Func_0046872" "Func_0003887" "Func_0003684" "Func_0016740"
> "Func_0006281" "Func_0006260" "Func_0016779" "Func_0005634"
> "Prot_10040" "Func_0005886" "Func_0046488" "Func_0016301" "Func_0007409"
> "Func_0005524" "Func_0016740" "Func_0016308" "Func_0000166"
>
> which is 8527 lines and 145 columns (not all the proteins have the same
> number of proteins)

functions?

First of all, you need to read this file into R properly. I would try
readLines, followed by some polishing to build a list in which each element
holds one protein's functions and is labelled with the protein name. After
that, a loop or lapply that checks each element with a regular expression
could populate a data frame with protein names in the first column and a
score in the second. You can then compare that score with the values in
another data frame.

However, without an example you will hardly get detailed help.

Regards
Petr

> What I want is to predict whether those proteins are related to cancer or
> not, depending on whether they have some functions. I found that there are
> 3 functions very often related to cancer, and in case a protein has 2/3 or
> 3/3 of them I want to "label" it (somehow, maybe by adding an extra column)
> as cancer related.
> The names of the proteins are always in the 1st column, but the names of
> the functions can be in any of the next columns.
>
> So what I did is to use this loop, but I can't write properly the way I
> want it to print the results so as to use them again.
> (I need to know the names of the proteins having the functions in a column,
> so as a next step I can compare it with another file
> -test data set- and conclude true positive, false positive, true
> negative, false negative.)
>
> It can't be as hard as I see it :):)
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Writing-to-a-file-tp3070617p4363940.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
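
The approach outlined in the reply (readLines, a named list, lapply, then a
data frame) might look roughly like the sketch below. It is only an
illustration under several assumptions: each protein is taken to occupy one
line of the file, the three IDs in cancer_funcs are placeholders picked from
the posted sample (the thread never says which three functions are meant),
and the file names are hypothetical. Plain %in% matching is used instead of a
regular expression because the function IDs are exact strings.

```r
# Sample lines in the same shape as the posted file: a quoted protein ID
# followed by a variable number of quoted function IDs.
lines <- c(
  '"Prot_10035" "Func_0005874" "Func_0016787" "Func_0005515" "Func_0000166"',
  '"Prot_10037" "Func_0005739" "Func_0005515"',
  '"Prot_1004" "Func_0046872" "Func_0016740" "Func_0005634"',
  '"Prot_10040" "Func_0005886" "Func_0016740" "Func_0000166"'
)
# For the real file (hypothetical name): lines <- readLines("proteins.txt")

# Split every line on whitespace and strip the double quotes.
tokens <- lapply(strsplit(trimws(lines), "[[:space:]]+"),
                 function(x) gsub('"', "", x, fixed = TRUE))
tokens <- tokens[lengths(tokens) > 0]  # drop empty lines, if any

# Named list: one element per protein, holding that protein's function IDs.
funcs <- lapply(tokens, function(x) x[-1])
names(funcs) <- vapply(tokens, function(x) x[1], character(1))

# PLACEHOLDER IDs: replace with the three functions of interest.
cancer_funcs <- c("Func_0005515", "Func_0000166", "Func_0016740")

# Score = how many of the three functions each protein has (0 to 3).
score <- vapply(funcs, function(f) sum(cancer_funcs %in% f), integer(1))

# One row per protein; the extra column flags proteins with 2/3 or 3/3.
result <- data.frame(
  protein = names(funcs),
  score = unname(score),
  cancer_related = unname(score) >= 2,
  stringsAsFactors = FALSE
)

# Write the table so it can be read back later (hypothetical output path).
out_file <- file.path(tempdir(), "protein_scores.txt")
write.table(result, out_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

Once a test data set with known labels is available, merging it with result
by protein name and calling table() on the predicted and known columns would
give the true/false positive and negative counts.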