Hi Henrique,

I tried your code. I simply copied and pasted it 'cause I have no idea how it 
works. What I get is the total number of A's, T's and all other characters, 
which was not what I intended. Maybe I need to modify your script before I can 
apply it within my own script? Could you explain what those commands are for?
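
For reference, one possible modification is sketched below. It is only a sketch under assumptions: the toy `data`, the helper names `row_key`, `column_counts` and `sfs_by_group` are made up for illustration, and row names are assumed to be digits followed by lowercase letters only. It groups rows by the same three-character key as the suggested code, counts characters per column within each group, drops "-", and then tabulates those counts per group.

```r
# Toy input: a list with one small matrix (values and row names are made up).
data <- list(matrix(c("A", "A", "T", "A",
                      "T", "G", "G", "-"), ncol = 2,
                    dimnames = list(c("361a", "362b", "72145a", "456b"), NULL)))

# Hypothetical helper: group key from a row name (zero-padded number, first 3 chars).
row_key <- function(nm) {
  substr(sprintf("%05d", as.numeric(gsub("[a-z]", "", nm))), 1, 3)
}

# Hypothetical helper: per-column character counts of a matrix, "-" excluded,
# pooled over columns into one integer vector.
column_counts <- function(m) {
  unlist(lapply(seq_len(ncol(m)), function(j) {
    tab <- table(m[, j])
    as.vector(tab[names(tab) != "-"])
  }))
}

# For every group key, collect the counts from all matrices and tabulate them.
sfs_by_group <- function(data) {
  per_matrix <- lapply(data, function(r) {
    idx <- split(seq_len(nrow(r)), row_key(row.names(r)))
    lapply(idx, function(i) column_counts(r[i, , drop = FALSE]))
  })
  keys <- unique(unlist(lapply(per_matrix, names)))
  out <- lapply(keys, function(k) table(unlist(lapply(per_matrix, `[[`, k))))
  names(out) <- keys
  out
}

sfs <- sfs_by_group(data)
```

Here `sfs` is a list with one frequency table per group key ("003", "004", "721" for the toy input), which is roughly the shape of result asked for in the original post. Whether the three-character key separates the groups correctly depends on the real row names.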

Thanks for the help anyway.

Cheers,

Carlos




________________________________
From: Henrique Dallazuanna <www...@gmail.com>

Cc: r-help@r-project.org
Sent: Thursday, August 27, 2009 7:00:45 PM
Subject: Re: [R] subset of a matrix

Try this:

lapply(data,
       function(r)
           lapply(split(r,
                        substr(sprintf("%05d",
                                       as.numeric(gsub("[a-z]", "", row.names(r)))),
                               1, 3)),
                  table))
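
The one-liner above can be unpacked step by step. The sketch below uses a toy matrix `r` built from the three example rows in the original post; the intermediate names `digits`, `padded`, `key`, `groups` and `counts` are introduced here only for illustration.

```r
# Toy stand-in for one element of `data`, taken from the example rows.
r <- matrix(c("A", "A", "T", "T", "G", "G"), ncol = 2,
            dimnames = list(c("361a", "456b", "72145a"), NULL))

# 1. Strip lowercase letters from the row names, leaving only the digits.
digits <- gsub("[a-z]", "", row.names(r))

# 2. Zero-pad every number to five digits so all names have the same width.
padded <- sprintf("%05d", as.numeric(digits))

# 3. Keep the first three characters of the padded string as the group key.
key <- substr(padded, 1, 3)

# 4. Split the matrix by key. split() treats the matrix as a plain vector and
#    recycles `key` over the columns, so each group pools the values of ALL
#    columns for its rows.
groups <- split(r, key)

# 5. Count how often each character occurs within each group.
counts <- lapply(groups, table)
```

Because step 4 pools all columns, `counts` holds the total number of each character per group rather than a per-column tally, and no second `table()` is applied to those counts.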


On Thu, Aug 27, 2009 at 1:27 PM, Carlos Gonzalo Merino Mendez <carlosgmerin

>Hello everyone, I would appreciate any help with the following.
>
>>My dataset is a list containing matrices. So if you type e.g.
>
>>data[[1]]
>
>>you get something like:
>
>>           [,1]    [,2]
>>361a       A    T
>>456b       A    G
>>72145a    T    G
>>........
>
>>As you can see, my rows have names, which are character strings containing 
>>numbers and letters. I want something similar to a histogram, per column, 
>>i.e. I want to know how many characters occur exactly once in a column, how 
>>many occur exactly twice, and so on. Maybe there is an easy way to do this, 
>>but I wrote my own code which works perfectly, so don't bother to correct it 
>>unless extremely necessary. I write down the code so you know exactly what 
>>I'm trying to do:
>
>>table <- vector()
>
>>for (i in (1:length(data))){
>
>>    for (j in (1:length(data[[i]][1,]))){
>
>>        t <- table(data[[i]][,j])
>
>>        table <- c(table, t)
>>}}
>
>># This line is necessary to eliminate "-" characters, which should not be
>># included in the analysis.
>>ncount <- table[names(table) != "-"]
>
>>sfs <- table (ncount)
>
>>And with this code I get something like:
>
>> 1   2   3   4   5   6   7   8   9  10 ....
>
>>542 125  98  49  47  41  26  31  22  18  ....
>
>>which is what I'm looking for.
>
>
>>Now comes THE problem:
>
>>As I said before my rows have names. Each name is unique. I want to apply my 
>>analysis to a subset of rows in each matrix, namely all rows whose names 
>>start with 3, all that start with 4, all that start with 721. In most cases 
>>only the first character is important, but since I have names of different 
>>length, in some cases I need the first three characters to differentiate the 
>>groups. I want to integrate this into the loop so that I get a vector (such 
>>as the one called "table" in my code) for each subset analyzed.
>
>>I tried using the subset function, but I couldn't figure out how to use it, 
>>because it's intended to use row values to define the subset, not row names.
>
>>I hope someone can help me out, but please bear in mind I am really new at R 
>>and most commands and parameters are really unfamiliar to me.
>
>>Thanks.
>
>
>
>
>>______________________________________________
>R-help@r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>and provide commented, minimal, self-contained, reproducible code.
>


-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O


