Hi Carlos,

I think my earlier suggestion was wrong. Sorry about that.
I was assuming that row names of equal length would make the data handling
easier for you. Is that true? If so, I can try to suggest an automatic way
of getting them.
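
One automatic way to get equal-length row names, as a minimal sketch: pad every name with leading zeros up to the length of the longest one. The matrix below is made up from the three rows shown in the question, and pad_names is a hypothetical helper, not an existing R function. It also assumes that name length is what separates the groups, as in the example (4 characters for the "3" and "4" groups, 6 for "721").

```r
# Hypothetical example matrix with row names of different lengths
mydata <- matrix(c("A", "A", "T", "T", "G", "G"), ncol = 2,
                 dimnames = list(c("361a", "456b", "72145a"), NULL))

# Hypothetical helper: left-pad each name with zeros to the longest length
pad_names <- function(x) {
  width <- max(nchar(x))
  zeros <- strrep("0", width - nchar(x))  # strrep() needs R >= 3.3.0
  paste0(zeros, x)
}

rownames(mydata) <- pad_names(rownames(mydata))
rownames(mydata)
# "00361a" "00456b" "72145a"

# With equal lengths, the first three characters identify the group
group <- substr(rownames(mydata), 1, 3)
```

Unlike one gsub() call per name, this does not need the names to be listed by hand.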


Best,

milton



On Thu, Aug 27, 2009 at 12:39 PM, milton ruser <milton.ru...@gmail.com> wrote:

> Hi Carlos,
>
> how about this step first:
>
> rownames(mydata) <- sub("^361a", "00361a", rownames(mydata))
> rownames(mydata) <- sub("^456a", "00456a", rownames(mydata))
>
> good luck
>
> milton
> On Thu, Aug 27, 2009 at 12:27 PM, Carlos Gonzalo Merino Mendez <
> carlosgmer...@yahoo.com> wrote:
>
>> Hello everyone, I would appreciate any help with the following.
>>
>> My dataset is a list containing matrices. So if you type e.g.
>>
>> data[[1]]
>>
>> you get something like:
>>
>>           [,1]    [,2]
>> 361a       A    T
>> 456b       A    G
>> 72145a    T    G
>> ........
>>
>> As you can see, my rows have names that are character strings containing
>> numbers and letters. I want something similar to a histogram, per column,
>> i.e. I want to know how many characters occur exactly once in a column,
>> how many occur exactly twice, and so on. Maybe there is an easy way to do
>> this, but I wrote my own code, which works perfectly, so don't bother to
>> correct it unless extremely necessary. I include the code so you know
>> exactly what I'm trying to do:
>>
>> table <- vector()
>>
>> for (i in (1:length(data))){
>>
>>    for (j in (1:length(data[[i]][1,]))){
>>
>>        t <- table(data[[i]][,j])
>>
>>        table <- c(table, t)
>> }}
>>
>> # this line is necessary to eliminate "-" characters, which should not be
>> # included in the analysis
>> ncount <- table[names(table) != "-"]
>>
>> sfs <- table (ncount)
>>
>> And with this code I get something like:
>>
>>  1   2   3   4   5   6   7   8   9  10 ....
>>
>> 542 125  98  49  47  41  26  31  22  18  ....
>>
>> which is what I'm looking for.
>>
>>
>> Now comes THE problem:
>>
>> As I said before my rows have names. Each name is unique. I want to apply
>> my analysis to a subset of rows in each matrix, namely all rows whose names
>> start with 3, all that start with 4, all that start with 721. In most cases
>> only the first character is important, but since I have names of different
>> length, in some cases I need the first three characters to differentiate the
>> groups. I want to integrate this into the loop so that I get a vector (such
>> as the one called "table" in my code) for each subset analyzed.
>>
>> I tried using the subset function, but I couldn't figure out how to use
>> it, because it's intended to use row values to define the subset, not row
>> names.
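
Rows can also be selected by name with a logical index on rownames(), without subset(). The following is only a sketch under assumptions: the list `data` here is a made-up stand-in built from the rows shown above plus one invented row ("362c"), the prefixes are the three mentioned in the question, and counts_for_prefix is a hypothetical helper that repeats the original loop on the matching rows.

```r
# Hypothetical data: a list of character matrices with unique row names
data <- list(
  matrix(c("A", "A", "T", "A",
           "T", "G", "G", "-"),
         ncol = 2,
         dimnames = list(c("361a", "456b", "72145a", "362c"), NULL))
)

# Prefixes that define the groups (taken from the question)
prefixes <- c("3", "4", "721")

# Hypothetical helper: per-column character counts over the rows whose
# names start with `prefix`, pooled across all matrices in `mats`
counts_for_prefix <- function(prefix, mats) {
  counts <- integer(0)
  for (m in mats) {
    keep <- startsWith(rownames(m), prefix)   # needs R >= 3.3.0
    sub_m <- m[keep, , drop = FALSE]          # stays a matrix even with one row
    for (j in seq_len(ncol(sub_m))) {
      counts <- c(counts, table(sub_m[, j]))
    }
  }
  counts[names(counts) != "-"]                # discard "-" as in the original code
}

# One count vector (like "table" in the original code) per group
counts_by_group <- lapply(setNames(prefixes, prefixes),
                          counts_for_prefix, mats = data)

# One frequency spectrum (like "sfs") per group
sfs_by_group <- lapply(counts_by_group, table)
```

counts_by_group[["3"]] then holds the counts for all rows whose names start with "3", and sfs_by_group[["3"]] the corresponding spectrum. Note that a prefix such as "7" would also match names starting with "721", so the prefixes have to be chosen so that the groups do not overlap.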
>>
>> I hope someone can help me out, but please bear in mind I am really new at
>> R and most commands and parameters are really unfamiliar to me.
>>
>> Thanks.
>>
>>
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
