On Jan 2, 2010, at 12:55 AM, che wrote:
I know it would be better to ask R to generate the data, but I need to
use this particular file, because it holds data for some amino acids
that I cannot alter. So I need R to go through the sequences one by one
and give me the count of each letter in each sequence. I am quite
confused about using "i" and "j", and how to iterate over both of them
so that they work together. I attached sequence.txt to my original
message, and I attach it here again just in case. Thanks for your help.
http://n4.nabble.com/file/n997087/sequence.txt sequence.txt
Sorry, I did not read to the very end. My apologies; hopefully the
following one-liner will make up for my dereliction of attention.
che wrote:
Could someone please help me sort this out? I am trying to write R code
to calculate the frequencies of the amino acids in 9 different
sequences. I want the code to read the sequences from an external text
file, so I used the following code:
x<-read.table("sequence.txt",header=FALSE)
Then I defined a vector of the 20 amino acids as follows:
AA <- c('A','C','D','E','F','G','H','I','K','L',
        'M','N','P','Q','R','S','T','V','W','Y')
I am using the following code to calculate the frequencies:
After copy-pasting the sequences from a browser window to a character
object, "seqnc", I then processed it:
> seqlines <- readLines(textConnection(seqnc))
# Then for the first sequence:
> table(strsplit(seqlines[1], vector()) )
 A  D  E  F  G  I  K  L  M  N  P  Q  R  S  T  V  W  Y
21 25 28 27 24 34 39 31 11 20 16 10 17 25 22 33  3 15
# For "mass production": the names that resulted from my first effort
# were a bit unwieldy (> 200 characters long), so I unnamed it:
unname( sapply(seqlines, function(x) table(strsplit(x, vector() ) ) ) )
[[1]]
 A  D  E  F  G  I  K  L  M  N  P  Q  R  S  T  V  W  Y
21 25 28 27 24 34 39 31 11 20 16 10 17 25 22 33  3 15

[[2]]
 A  C  D  E  F  G  H  I  K  L  M  N  P  Q  R  S  T  V  W  Y
34  5 15 25  6 35  7 24 23 32  9 12 15 10 17 14 13 36  2 13

[[3]]
 A  C  D  E  F  G  H  I  K  L  M  N  P  Q  R  S  T  V  W  Y
33  5 17 24  7 36  7 24 24 32  9 13 14  9 17 12 14 36  2 12

[[4]]
 A  C  D  E  F  G  H  I  K  L  M  N  P  Q  R  S  T  V  W  Y
33  5 16 25  5 35  6 24 23 33  8 12 15  9 17 17 12 35  2 15

[[5]]
 A  C  D  E  F  G  H  I  K  L  M  N  P  Q  R  S  T  V  W  Y
33  4 15  6 21 30  3 19 23 22  8  8  8 14 17 14 12 24  5 12

[[6]]
 A  C  D  E  F  G  H  I  K  L  M  N  P  Q  R  S  T  V  W  Y
30  3 13  4 16 22  2 17 16 17  6  6  7 11 15 11 12 18  3 11

[[7]]
 A  C  D  E  F  G  H  I  K  L  M  N  P  Q  R  S  T  V  W  Y
39  5 21  8 22 39  2 23 29 25 10  8  7 13 22 14 21 25  7 16

[[8]]
 A  C  D  E  F  G  H  I  K  L  M  N  P  Q  R  S  T  V  W  Y
34  4 17  6 19 30  2 20 24 21  8  7  7 12 17 14 16 21  5 14

[[9]]
 A  C  D  E  F  G  H  I  K  L  M  N  P  Q  R  S  T  V  W  Y
35  4 17  6 18 31  3 20 23 21  8  7  7 12 18 12 17 21  5 13

[[10]]
A
5
--
David.
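
In the output above the first sequence has no C or H column, so the
tables do not all have the same length. If a fixed set of 20 columns is
wanted, one option is to tabulate against a factor whose levels are the
AA vector from the original post. What follows is only a minimal sketch
under that assumption: "seqlines" holds made-up strings rather than the
real sequence.txt, and "count_aa" is a hypothetical helper name.

```r
# Made-up stand-in for the lines of sequence.txt (NOT the real data).
seqlines <- c("ACDA", "KKLM", "A")

# The 20 amino acid one-letter codes, as defined in the original post.
AA <- c("A", "C", "D", "E", "F", "G", "H", "I", "K", "L",
        "M", "N", "P", "Q", "R", "S", "T", "V", "W", "Y")

# Hypothetical helper: count the letters of one sequence.
# factor() with fixed levels keeps a zero count for every amino acid
# that does not occur in the sequence.
count_aa <- function(s) {
  chars <- strsplit(s, "")[[1]]
  table(factor(chars, levels = AA))
}

# sapply() returns one column per sequence; t() turns that into
# one row per sequence and one column per amino acid.
counts <- t(sapply(seqlines, count_aa, USE.NAMES = FALSE))
counts
```

Because every row has the same 20 columns, the result can be indexed by
letter, for example counts[, "C"], without checking which letters each
sequence happened to contain. Letters outside AA are dropped by factor()
rather than counted.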
frequency <- function(X)
{
  y <- rep(0, 20)
  for (j in 1:nchar(as.character(x$V1[i]))) {
    for (i in 1:9) {
      res <- which(AA == substr(x$V1[i], j, j))
      y[res] = y[res] + 1
    }
  }
  return(y)
}
But this code is not actually working: it reads only one sequence. I
don't know why the loop is not working for "i", which is supposed to
read the nine rows of the file sequence.txt. The sequence.txt file is
attached to this message.
Cheers
http://n4.nabble.com/file/n997072/sequence.txt sequence.txt
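
Reading the posted function as written, three things seem to stand in
the way: the argument is named X while the body uses x; the bound of the
"j" loop refers to "i" before the "i" loop has set it; and all nine
sequences add into one shared vector y, so only a single set of 20
counts comes back. A possible rearrangement, offered as a sketch rather
than as the definitive fix, puts "i" in the outer loop and keeps one row
of counts per sequence. The data frame below is a made-up stand-in for
the result of read.table("sequence.txt", header = FALSE), and the name
"aa_frequency" is hypothetical (chosen to avoid masking
stats::frequency).

```r
# The 20 amino acid one-letter codes, as defined in the original post.
AA <- c("A", "C", "D", "E", "F", "G", "H", "I", "K", "L",
        "M", "N", "P", "Q", "R", "S", "T", "V", "W", "Y")

# Made-up stand-in for read.table("sequence.txt", header = FALSE).
x <- data.frame(V1 = c("ACDA", "KKLM", "A"), stringsAsFactors = FALSE)

# Hypothetical name; the original post called this function "frequency".
aa_frequency <- function(x) {
  # One row per sequence, one column per amino acid, all starting at 0.
  y <- matrix(0L, nrow = nrow(x), ncol = length(AA),
              dimnames = list(NULL, AA))
  for (i in seq_len(nrow(x))) {        # outer loop: sequences (rows)
    s <- as.character(x$V1[i])         # also handles a factor column
    for (j in seq_len(nchar(s))) {     # inner loop: positions within s
      res <- which(AA == substr(s, j, j))
      y[i, res] <- y[i, res] + 1L      # no-op if the letter is not in AA
    }
  }
  y
}

result <- aa_frequency(x)
result
```

With "i" outside and "j" inside, the length of each sequence is looked
up only after "i" has a value, and seq_len() avoids the 1:0 trap for an
empty sequence.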
--
View this message in context:
http://n4.nabble.com/caculate-the-frequencies-of-the-Amino-Acids-tp997072p997087.html
Sent from the R help mailing list archive at Nabble.com.
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.