[R] solution

2010-08-19 Thread Alex Levitchi
Hello 
I am not sure about your general aim, but from my previous experience with 
combinations of treatments, it can be more useful to create a matrix or 
data.frame, which you can fill either with 1s and 0s or with the treatment 
values, especially if you will need this data later for the analysis. 
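
As a rough illustration of the 1s and 0s idea, here is a minimal sketch. The 
names treatments, comparator and design are made up for this example, and the 
layout (rows as treatments, columns as comparators) is only an assumption 
about what would suit the analysis. 

```r
# Hypothetical treatment labels and one chosen comparator
treatments <- c("t1", "t2", "t3", "t4")
comparator <- "t2"

# Square matrix of zeros with one row and one column per treatment
design <- matrix(0,
                 nrow = length(treatments),
                 ncol = length(treatments),
                 dimnames = list(Treatment = treatments,
                                 Comparator = treatments))

# Mark with 1 every treatment that is compared against the comparator
design[treatments != comparator, comparator] <- 1
design
```

The same matrix could hold treatment values instead of 1s, if those are needed 
later. 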

I can propose a small function, but only for the case of pair selection 

treat=function(x,y){ 
# keep every treatment except the comparator and pair each one with it 
keep=x[x!=y] 
data.frame(Treatment=keep, Comparator=rep(y,times=length(keep))) 
} 

 treat(c("t1","t2","t3","t4"),"t2") 
Treatment Comparator 
1 t1 t2 
2 t3 t2 
3 t4 t2 

In this case you can define the list of treatment names and the comparator 
each time you need them, and you will get the result as a data frame in the 
form you asked for. 
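
If every treatment has to serve as the comparator in turn, the function can be 
called once per comparator and the pieces stacked. This is only a sketch under 
that assumption: treat is repeated here so the example runs on its own, and 
the names treatments and all_pairs are made up. 

```r
# Pair every treatment in x except y with the comparator y
treat <- function(x, y) {
  keep <- x[x != y]
  data.frame(Treatment = keep, Comparator = rep(y, times = length(keep)))
}

treatments <- c("t1", "t2", "t3", "t4")

# One data frame per comparator, then bind them row-wise
all_pairs <- do.call(rbind, lapply(treatments, function(y) treat(treatments, y)))
nrow(all_pairs)  # 4 comparators x 3 treatments each = 12 rows
```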

Good luck 

Alex Levitchi 
PhD in Genetics, 
Bioinformatician at Laboratory of Bioinformatics 
CBM, Area Science Park, Trieste, Italy 
http://www.cbm.fvg.it/laboratories/bioinformatics_research 

scientific researcher, 
Center of Molecular Biology, 
University of Academy of Sciences of Moldova 
www.edu.asm.md 

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How can I associate a list of defined names with the dataframes to be downloaded

2010-02-16 Thread Alex Levitchi
Hello 
I am very thankful for the replies from Jim Holtman and David Winsemius, 
especially for the understandable explanations. It really works. 
Now I have another problem that I cannot figure out. 
This is the situation: 
I work in biology. I need to download several files belonging to an experiment, 
which can be found in NCBI GEO, and store them. 
For this I use the GEOquery package and its getGEO function. 
Each experiment (named GSE) contains several Samples (GSM), whose names I 
extract with names(GSMList()). 
So now I want to associate each downloaded file with a predefined name, namely 
the lowercase version of the GSM name taken from names(GSMList()), where 
class(names(GSMList(gse))) 
[1] "character" 

I wrote something like this 
lapply=(i=1:length(names(GSMList(gse))), 
lgsms[i]=getGEO(names(GSMList(gse))[i])) 
but I get 
Error: unexpected ',' 
or something similar. 
If I try to do it directly, by assigning the downloaded GEO file to the 
corresponding name from the list, I get this: 
lgsms[1]=getGEO(names(GSMList(gse))[1]) # lgsms - list of lowercase names from names(GSMList()) 
File stored at: 
/tmp/RtmpgnMuHv/GSM296650.soft 
Error in lgsms[1] = getGEO(names(GSMList(gse))[1]) : 
incompatible types (from S4 to character) in subassignment type fix 
So it downloaded the file but did not make the association with the name, so I 
cannot use it. 
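
The error message suggests that lgsms is a character vector, which cannot hold 
the S4 objects returned by getGEO. A minimal sketch of one possible way around 
it, assuming a plain list is acceptable: collect the downloads in a list and 
use the lowercase names as the list names. Here fetch_sample is a hypothetical 
stand-in for getGEO so that the sketch runs without a network connection, and 
the second sample id is made up. 

```r
# Hypothetical stand-in for getGEO(); the real call would download the sample
fetch_sample <- function(id) list(accession = id)

# In practice these would come from names(GSMList(gse)); second id is made up
gsm_names <- c("GSM296650", "GSM296651")

# lapply() returns a list, which can hold objects of any class
downloads <- lapply(gsm_names, fetch_sample)

# Attach the lowercase names so each download can be looked up by name
names(downloads) <- tolower(gsm_names)

downloads[["gsm296650"]]$accession
```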

Generally, working with GEOquery should be done very carefully, as I have 
previously had some problems with the characteristics of the data extracted 
from it and with how to convert them into a usable form. It is a pity that the 
authors don't give more explanation of it. So I suppose the same applies here. 

Kind regards 
Alex Levitchi 




[R] How can I rearrange my dataframe

2010-02-09 Thread Alex Levitchi
Hello 
I have only recently begun to work with R, so I am not very experienced. 
I cannot find a clean way to process my dataframe, which is a fairly large 
one. 
It looks similar to this 

 name=c("A","B","C","B","C","C","C","B","C") 
 nicknames=c("A1","B1","C1","B2","C2","C3","C4","B3","C5") 
 value=c(4,5,9,2,7,6,3,6,7) 
 table=data.frame(name,nicknames,value) # no cbind(), so value stays numeric 
 table 
name nicknames value 
1 A A1 4 
2 B B1 5 
3 C C1 9 
4 B B2 2 
5 C C2 7 
6 C C3 6 
7 C C4 3 
8 B B3 6 
9 C C5 7 

I have to rearrange it in the following way: 
- the first column should contain only the unique names. I have done this, it 
is OK, and it looks like 
1 A 
2 B 
3 C 

- the second column should contain all the 'nicknames' that correspond to 
each single A, B or C 
name nickname 
1 A A1 
2 B B1,B2,B3 
3 C C1,C2,C3,C4,C5 

- the third one should contain the mean of the values that correspond to 
the same A, B or C 
name nickname value 
1 A A1 mean(4) 
2 B B1,B2,B3 mean(c(5,2,6)) 
3 C C1,C2,C3,C4,C5 mean(c(9,7,6,3,7)) 

I did this using a 'for' loop. 
To be clear, I created three dataframes, one for each of the columns, and 
combine them at the end. 

 ulist=which(!duplicated(table$name)) # positions of the first occurrence of each name 
 name1=data.frame(table$name[ulist]) # the unique names 
 nicknames1=data.frame(row.names=1:length(ulist)) # empty dataframe, one row per unique name 
 value1=data.frame(row.names=1:length(ulist)) # empty dataframe, one row per unique name 

 for(i in 1:length(ulist)) { 
 position=which(as.character(name1[i,1])==table$name) 
 nicknames1[i,1]=toString(table$nicknames[position]) 
 # as.character() first, so that a factor column is read by label and not by level code 
 value1[i,1]=mean(as.numeric(as.character(table$value[position]))) 
 } 
 fin=cbind(name1,nicknames1,value1) 
 colnames(fin)=c("NAME","NICKNAME","VALUE") 
 fin 
NAME NICKNAME VALUE 
1 A A1 4.00 
2 B B1, B2, B3 4.33 
3 C C1, C2, C3, C4, C5 6.40 

It works, but in general I work with dataframes of much larger dimensions 
(tens of thousands of rows or more). 
So my loop runs too slowly (i.e., a dataframe of 2 rows and 3 columns is 
processed in about 10 minutes). 
I intend to integrate it into a function, so it is obvious that the time will 
be even longer. 

If someone can advise me on how to modify what I have done, or on a better 
way to do it, please send me a message. 
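
For reference, one loop-free possibility is to group with tapply. This is 
only a sketch on the toy data above, assuming the value column is numeric. The 
names df, nick, avg and fin2 are made up for this example, and I have not 
timed it on a large dataframe. 

```r
# Toy data from above, with value kept numeric
df <- data.frame(
  name      = c("A", "B", "C", "B", "C", "C", "C", "B", "C"),
  nicknames = c("A1", "B1", "C1", "B2", "C2", "C3", "C4", "B3", "C5"),
  value     = c(4, 5, 9, 2, 7, 6, 3, 6, 7),
  stringsAsFactors = FALSE
)

# One result per unique name: pasted nicknames and mean value
nick <- tapply(df$nicknames, df$name, toString)
avg  <- tapply(df$value, df$name, mean)

# Both results are ordered by the sorted unique names, so they line up
fin2 <- data.frame(NAME     = names(avg),
                   NICKNAME = as.vector(nick),
                   VALUE    = as.vector(avg))
fin2
```

tapply splits the column once by group instead of searching the whole 
dataframe for every unique name, which is the part that makes the loop slow. 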

Kind regards to all the people who develop and maintain R for such dummies as 
me 
Alex Levitchi 


