Re: [R] R help for creating expression data of Differentially expressed genes

arun Tue, 07 May 2013 15:36:27 -0700

Hi,
Assuming that "out_data.txt" is the output you expected:


dat1<- read.table("data1.txt",header=TRUE,stringsAsFactors=FALSE)
dat2<- read.table("data2.txt",header=TRUE,stringsAsFactors=FALSE)
out_dat<- read.table("out_data.txt",header=TRUE,stringsAsFactors=FALSE)
# Keep the first four columns of dat1 and join all of dat2 on ID
out_dat2<-merge(dat1[,1:4],dat2,by="ID")
identical(out_dat,out_dat2)
#[1] TRUE
A.K.
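
One caveat on the merge above: by default merge() keeps only the IDs found in both files, so the result can have fewer rows than the first file. A minimal sketch of keeping every row of the first file, assuming toy data frames in place of data1.txt and data2.txt (the IDs, gene names and values below are made up for illustration) and assuming the IDs are unique in each table:

```r
# Toy stand-ins for data1.txt and data2.txt (hypothetical values)
dat1 <- data.frame(ID = c("g3", "g1", "g2"),
                   Gene = c("TP53", "MYC", "EGFR"),
                   stringsAsFactors = FALSE)
dat2 <- data.frame(ID = c("g1", "g3", "g9"),
                   X1 = c(5.2, 7.1, 3.3),
                   X2 = c(1.0, 2.5, 4.4),
                   stringsAsFactors = FALSE)

# Inner join (the default): only IDs present in both tables survive
inner <- merge(dat1, dat2, by = "ID")

# Left join: every row of dat1 is kept; unmatched IDs get NA in the dat2 columns
left <- merge(dat1, dat2, by = "ID", all.x = TRUE)

# merge() sorts by the key; restore the row order of dat1 if that matters
left <- left[match(dat1$ID, left$ID), ]
rownames(left) <- NULL
```

Here inner has two rows (g1 and g3), while left has all three IDs of dat1, with NA in X1 and X2 for g2.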





________________________________
From: Vivek Das <vd4mm...@gmail.com>
To: arun <smartpink...@yahoo.com> 
Cc: R help <r-help@r-project.org> 
Sent: Tuesday, May 7, 2013 6:07 PM
Subject: Re: R help for creating expression data of Differentially expressed 
genes



Hi Arun,

My data sets are in the attached sample files. I hope they give a better idea 
of what I want to do with the two files and the kind of script I am trying to 
write. I hope you can give me some suggestions. I am new to R, so I am having 
trouble working out which functions to use for this.

Any help would be greatly appreciated.



----------------------------------------------------------

Vivek Das
PhD Student in Computational Biology
Giuseppe Testa's Lab
European School of Molecular Medicine
IFOM-IEO Campus
Via Adamello, 16
Milan, Italy

emails: vivek....@ieo.eu
            vchris...@yahoo.co.in
            vd4mm...@gmail.com



On Tue, May 7, 2013 at 10:36 PM, arun <smartpink...@yahoo.com> wrote:

Hi Vivek,
>
>Maybe this helps:
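>set.seed(35)
># dat1: ID plus 7 random columns (V1..V7)
>dat1<- cbind(ID=1:8,
>as.data.frame(matrix(sample(1:50,8*7,replace=TRUE),ncol=7)))
>
>set.seed(38)
># dat2: 8 IDs drawn from 1:20 plus 33 random columns, renamed X1..X33
>dat2<- cbind(ID= sample(1:20,8,replace=FALSE),
>as.data.frame(matrix(sample(1:50,8*33,replace=TRUE),ncol=33)))
>colnames(dat2)[-1]<-gsub("V","X",colnames(dat2)[-1])
># First two columns of dat1 and first 31 columns of dat2, matched on ID
>merge(dat1[,1:2],dat2[,1:31],by="ID")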
>#  ID V1 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20
>#1  1 43 44  4 33 47 29 43 31 15  2  34  42   5  18  22  36  34  44   3  45   9
>#2  3 28  4 18 45 24  5 20 30 16 49  34  33   5  24  49  31  10  45  21  26  20
>#3  6  5 16  1  5  2 26  6 40 16 15  50  26  37  22  25  39  16  24  29  50  42
>#4  7 25 26 39 16 29  5 40 15 27 46  16  38  36  42   8   3  29   7  13  18  38
>#5  8 30  3 41 25 38 24 41 44 23  2  45  33  10  18  20  49  19  23  42  25   5
>#  X21 X22 X23 X24 X25 X26 X27 X28 X29 X30
>#1  14  27   3  21   6  44  33  42  10  29
>#2  48  13   8  47  18   9  23   9  44   3
>#3  25  14  31  19  14   6  26  13   6  49
>#4  43  28  15   6   9  19  43  21  41  21
>#5   1  27  18   3  42   5  16  39  46  47
>
>A.K.
>
>
>
>----- Original Message -----
>
>From: Vivek Das <vd4mm...@gmail.com>
>To: arun <smartpink...@yahoo.com>
>Cc:
>
>Sent: Tuesday, May 7, 2013 3:45 PM
>Subject: R help for creating expression data of Differentially expressed genes
>
>Hi Arun,
>
>I need some help with R scripting. I have two data files, one containing 
>seven columns and the other containing 33. Both files have a unique identifier 
>column, ID. I want to create another file that has the first two columns 
>of the first file and the 31 columns of the second file, matched on ID. 
>The first file holds the gene IDs and gene names of around 500 genes, and I 
>want the output file to contain all of those along with the other attributes. 
>That is, the output should have 500 rows, one per ID in the first file, with 
>all the attributes of the second file. I am new to R and am having trouble 
>with the merge function. If you can help, that would be great.
>
>Regards,
>Vivek
>
>Sent from my iPad
>
>On 07/May/2013, at 21:13, arun <smartpink...@yahoo.com> wrote:
>
>> Hi Ye,
>>
>> For the NA in the ID column: split() drops rows whose ID is NA, so the
>> result below has only five elements.
>>
>> dat1<- read.table(text="
>> ObsNumber     ID          Weight
>>      1                 0001         12
>>      2                 0001          13
>>      3                 0001           14
>>      4                  0002         16
>>       5                 0002         17
>>      6                   N/A          18 
>> ",sep="",header=TRUE,colClasses=c("numeric","character","numeric"),na.strings="N/A")
>>  unlist(lapply(split(dat1,dat1$ID),function(x) 
>>with(x,as.character(interaction(ID,seq_len(nrow(x)),sep="_")))),use.names=FALSE)
>> #[1] "0001_1" "0001_2" "0001_3" "0002_1" "0002_2"
>> A.K.
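
To get a label such as "N/A_1" for the missing IDs instead of dropping those rows, one option is to give NA an explicit label before counting. A minimal sketch, assuming the same dat1 as above built with data.frame(); the helper name id_lab is only illustrative and not from the thread:

```r
dat1 <- data.frame(ObsNumber = 1:6,
                   ID = c("0001", "0001", "0001", "0002", "0002", NA),
                   Weight = c(12, 13, 14, 16, 17, 18),
                   stringsAsFactors = FALSE)

# Give missing IDs an explicit label so they form their own group
id_lab <- ifelse(is.na(dat1$ID), "N/A", dat1$ID)

# ave() applies seq_along() within each group and returns the counters
# in the original row order, so the data need not be sorted by ID
counter <- ave(seq_along(id_lab), id_lab, FUN = seq_along)

dat1$UniqueID <- paste(id_lab, counter, sep = "_")
```

Because the result has one element per row, the assignment no longer fails with a length mismatch.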
>> ________________________________
>> From: Ye Lin <ye...@lbl.gov>
>> To: arun <smartpink...@yahoo.com>
>> Cc: R help <r-help@r-project.org>
>> Sent: Tuesday, May 7, 2013 2:54 PM
>> Subject: Re: [R] create unique ID for each group
>>
>>
>>
>> Thanks A.K. But I have "NA" in the ID column, so when I apply the code, it 
>> gives me an error saying the replacement has fewer rows than the data. Is 
>> there any way, for ID=N/A, to return something like "N/A_1", in order as well?
>>
>>
>>
>>
>>
>>
>> On Tue, May 7, 2013 at 11:17 AM, arun <smartpink...@yahoo.com> wrote:
>>
>>> Hi,
>>> Sorry, a mistake:
>>> dat1$UniqueID<-unlist(lapply(split(dat1,dat1$ID),function(x) 
>>> with(x,as.character(interaction(ID,seq_len(nrow(x)),sep="_")))),use.names=FALSE)
>>> dat1
>>>  # ObsNumber   ID Weight UniqueID
>>> #1         1 0001     12   0001_1
>>> #2         2 0001     13   0001_2
>>> #3         3 0001     14   0001_3
>>> #4         4 0002     16   0002_1
>>> #5         5 0002     17   0002_2
>>>
>>> dat2$UniqueID<-unlist(lapply(split(dat2,dat2$ID),function(x) 
>>> with(x,as.character(interaction(ID,seq_len(nrow(x)),sep="_")))),use.names=FALSE)
>>>
>>> A.K.
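
With a within-ID counter in both tables, the in-order merge from the original question can be sketched as follows. This assumes the dat1 and dat2 from the thread, rebuilt here with data.frame(); the counter column name k is an illustrative choice:

```r
dat1 <- data.frame(ObsNumber = 1:5,
                   ID = c("0001", "0001", "0001", "0002", "0002"),
                   Weight = c(12, 13, 14, 16, 17),
                   stringsAsFactors = FALSE)
dat2 <- data.frame(ID = c("0001", "0001", "0001", "0002", "0002"),
                   Height = c(3.2, 2.6, 3.2, 2.2, 2.6),
                   stringsAsFactors = FALSE)

# Running counter within each ID: 1, 2, 3 for "0001" and 1, 2 for "0002"
dat1$k <- ave(seq_along(dat1$ID), dat1$ID, FUN = seq_along)
dat2$k <- ave(seq_along(dat2$ID), dat2$ID, FUN = seq_along)

# Match the n-th row of an ID in dat1 with the n-th row of that ID in dat2
merged <- merge(dat1, dat2, by = c("ID", "k"))
merged$UniqueID <- paste(merged$ID, merged$k, sep = "_")
```

Merging on the pair (ID, k) avoids pasting the key into a string first; UniqueID is added afterwards only for display.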
>>>
>>>
>>>
>>>
>>>
>>> ----- Original Message -----
>>>
>>> From: arun <smartpink...@yahoo.com>
>>> To: Ye Lin <ye...@lbl.gov>
>>> Cc: R help <r-help@r-project.org>
>>> Sent: Tuesday, May 7, 2013 2:10 PM
>>> Subject: Re: [R] create unique ID for each group
>>>
>>>
>>>
>>> Hi,
>>>
>>> Try this:
>>> dat1<- read.table(text="
>>> ObsNumber     ID          Weight
>>>      1                 0001         12
>>>      2                 0001          13
>>>      3                 0001           14
>>>      4                  0002         16
>>>       5                 0002         17
>>> ",sep="",header=TRUE,colClass=c("numeric","character","numeric"))
>>> dat2<- read.table(text="
>>> ID               Height
>>> 0001            3.2
>>> 0001             2.6
>>> 0001             3.2
>>> 0002             2.2
>>> 0002              2.6
>>> ",sep="",header=TRUE,colClass=c("character","numeric"))
>>> dat1$UniqueID<-with(dat1,as.character(interaction(ID,ObsNumber,sep="_")))
>>> dat2$UniqueID<-with(dat2,as.character(interaction(ID,rownames(dat2),sep="_")))
>>>  dat2
>>> #    ID Height UniqueID
>>> #1 0001    3.2   0001_1
>>> #2 0001    2.6   0001_2
>>> #3 0001    3.2   0001_3
>>> #4 0002    2.2   0002_4
>>> #5 0002    2.6   0002_5
>>> A.K.
>>>
>>>
>>>
>>> ----- Original Message -----
>>> From: Ye Lin <ye...@lbl.gov>
>>> To: R help <r-help@r-project.org>
>>> Cc:
>>> Sent: Tuesday, May 7, 2013 1:54 PM
>>> Subject: [R] create unique ID for each group
>>>
>>> Hey All,
>>>
>>> I have a dataset(dat1) like this:
>>>
>>> ObsNumber     ID          Weight
>>>      1                 0001         12
>>>      2                 0001          13
>>>      3                 0001           14
>>>      4                  0002         16
>>>       5                 0002         17
>>>
>>> And another dataset(dat2) like this:
>>>
>>> ID               Height
>>> 0001            3.2
>>> 0001             2.6
>>> 0001             3.2
>>> 0002             2.2
>>> 0002              2.6
>>>
>>> I want to merge dat1 and dat2 based on "ID", in order. I know "match" only
>>> returns the first match it finds, so I am thinking of creating a unique ID
>>> column in dat1 and dat2, then merging. But I don't know how to do that so
>>> that it looks like this:
>>>
>>> dat1:
>>>
>>> ObsNumber     ID          Weight  UniqueID
>>>      1                 0001         12         0001_1
>>>      2                 0001          13        0001_2
>>>      3                 0001           14       0001_3
>>>      4                  0002         16         0002_1
>>>       5                 0002         17         0002_2
>>>
>>> dat2:
>>>
>>> ID               Height   UniqueID
>>> 0001            3.2          0001_1
>>> 0001             2.6         0001_2
>>> 0001             3.2         0001_3
>>> 0002             2.2         0002_1
>>> 0002              2.6        0002_2
>>>
>>> Or if it is possible to merge dat1 and dat2 by matching "ID" but return the
>>> match in order that would be great!
>>>
>>> Thanks for your help!
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>

