Hi Arun,

Massive thanks for the hint about using 'paste0'!
But, coincidentally, there were no repeated (V1, V2) pairs in the previous
example, so no key was duplicated within indxTem1 or indxTem2. I have changed
the data as below, which is very likely what my real data look like...

V1<-rep(c(rep(111,5),rep(222,5),rep(333,5)),2)  # V1 here are some data index with lots of repeated numeric values
V2<-c(1:23, 6,7,11,4,5,6,7)  # there are also duplicated values in V2
Tem1<-cbind(V1,V2)
Tem2<-Tem1[c(1:11,13:15,18:19),]  # I know that Tem2 is a subset of Tem1...

And my target outcome is the difference between Tem1 and Tem2, as below:

 V1  V2
333  12
111  16
111  17
111  20
222  21
222  22
222  23
222   6
222   7
333  11
333   4
333   5
333   6
333   7

Many thanks
HJ

On Wed, Mar 6, 2013 at 9:29 PM, arun <smartpink...@yahoo.com> wrote:

>
> Hi,
> How about this:
>
> indxTem1<-paste0(Tem1[,1],Tem1[,2])
> indxTem2<-paste0(Tem2[,1],Tem2[,2])
> Tem1[!indxTem1%in%indxTem2,]
> #       V1 V2
> #  [1,] 333 11
> #  [2,] 111 16
> #  [3,] 111 17
> #  [4,] 111 20
> #  [5,] 222 21
> #  [6,] 222 22
> #  [7,] 222 23
> #  [8,] 222  1
> #  [9,] 222  2
> # [10,] 333  3
> # [11,] 333  4
> # [12,] 333  5
> # [13,] 333  6
> # [14,] 333  7
>
> A.K.
> ________________________________
> From: HJ YAN <yhj...@googlemail.com>
> To: arun <smartpink...@yahoo.com>
> Cc: r-help@r-project.org
> Sent: Wednesday, March 6, 2013 4:09 PM
> Subject: Re: [R] How to combine conditional argument and logical argument in R to create subset of data...
>
> Dear Arun
>
> Thanks a million for your prompt reply. I love all four ways in your
> reply.
>
> I tried the code and just realised an issue: in my real work, my data
> is about 4GB large and I'm sure that there are many duplicated values in
> V2, so my V1 and V2 should be something like
>
> V1<-rep(c(rep(111,5),rep(222,5),rep(333,5)),2)  # V1 here are some data index with lots of repeated numeric values
> V2<-c(1:23, 1:7)  # there are also duplicated values in V2
> Tem1<-cbind(V1,V2)
> Tem2<-Tem1[c(1:10,12:15,18:19),]  # I know that Tem2 is a subset of Tem1...
>
> So how do I get the difference between Tem1 and Tem2 if the values in V2
> have duplicates?
>
>  V1  V2
> 333  11
> 111  16
> 111  17
> 111  20
> 222  21
> 222  22
> 222  23
> 222   1
> 222   2
> 333   3
> 333   4
> 333   5
> 333   6
> 333   7
>
> Massive thanks
> HJ
>
> On Wed, Mar 6, 2013 at 4:12 PM, arun <smartpink...@yahoo.com> wrote:
>
> >Just to add:
> >
> >Tem1[Tem1[,2]%in%setdiff(Tem1[,2],Tem2[,2]),]
> >
> >A.K.
> >
> >----- Original Message -----
> >From: arun <smartpink...@yahoo.com>
> >To: HJ YAN <yhj...@googlemail.com>
> >Cc: R help <r-help@r-project.org>
> >Sent: Wednesday, March 6, 2013 11:06 AM
> >Subject: Re: [R] How to combine conditional argument and logical argument in R to create subset of data...
> >
> >Hi,
> >No problem.
> >V1<-rep(c(rep(111,5),rep(222,5),rep(333,5)),2)
> >length(V1)
> >#[1] 30
> >
> >V2<- c(1:30)  #should be the same length as V1
> >Tem1<- cbind(V1,V2)
> >Tem2<-Tem1[1:20,]
> >
> >Tem1[!Tem1[,2]%in%Tem2[,2],]
> >#       V1 V2
> >#  [1,] 222 21
> >#  [2,] 222 22
> >#  [3,] 222 23
> >#  [4,] 222 24
> >#  [5,] 222 25
> >#  [6,] 333 26
> >#  [7,] 333 27
> >#  [8,] 333 28
> >#  [9,] 333 29
> ># [10,] 333 30
> >
> >#or
> >subset(Tem1,!V2%in% Tem2[,2])
> >#or
> >Tem1[is.na(match(Tem1[,2],Tem2[,2])),]
> >#       V1 V2
> >#  [1,] 222 21
> >#  [2,] 222 22
> >#  [3,] 222 23
> >#  [4,] 222 24
> >#  [5,] 222 25
> >#  [6,] 333 26
> >#  [7,] 333 27
> >#  [8,] 333 28
> >#  [9,] 333 29
> ># [10,] 333 30
> >A.K.
> >
> >________________________________
> >From: HJ YAN <yhj...@googlemail.com>
> >To: arun <smartpink...@yahoo.com>
> >Sent: Wednesday, March 6, 2013 10:33 AM
> >Subject: Re: [R] How to combine conditional argument and logical argument in R to create subset of data...
> >
> >Thank you SO MUCH Arun!!!
> >
> >That's brilliant. I've learnt some very useful new R commands now, e.g.
> >'do.call' and 'split'. And I see where my code went wrong now.
> >
> >I greatly appreciate your prompt reply.
> >
> >Also, I wonder if there exists a package that can find the difference
> >between two data frames, e.g. when one is a subset of the other? E.g.
> >
> >V1<-rep(c(rep(111,5),rep(222,5),rep(333,5)),2)
> >V2<-c(1:23)
> >Tem1<-cbind(V1,V2)
> >
> >Tem2<-Tem1[1:20,]
> >
> >How do I get an outcome like
> >
> >[21,] 333 21
> >[22,] 333 22
> >[23,] 333 23
> >
> >P.S. I used 'setdiff' before, but it seems to work only for vectors and
> >not for data frames??
> >
> >Sorry for so many questions today, as I'm coding for a work deadline
> >tonight.
> >
> >Many thanks!
> >Cheers
> >HJ
> >
> >On Wed, Mar 6, 2013 at 1:55 PM, arun <smartpink...@yahoo.com> wrote:
> >
> >Hi,
> >>You can also try this:
> >>Tem3<- list()
> >>for(i in unique(Tem1[,1])) {
> >>  Tem3[[i]]<- subset(Tem1,Tem1[,1]==i)
> >>  Tem4<- do.call(rbind,Tem3)
> >>}
> >>head(Tem4)
> >>#      V1 V2
> >>#[1,] 111  1
> >>#[2,] 111  2
> >>#[3,] 111  3
> >>#[4,] 111  4
> >>#[5,] 111 13
> >>#[6,] 111 14
> >>
> >>#or
> >>Tem3<-c(NA,NA)
> >>for(i in unique(Tem1[,1])) {
> >>  Tem2<- subset(Tem1, Tem1[,1]==i)
> >>  Tem3<- rbind(Tem3,Tem2)
> >>  Tem5<- Tem3[-1,]
> >>}
> >>head(Tem5)
> >>#  V1 V2
> >># 111  1
> >># 111  2
> >># 111  3
> >># 111  4
> >># 111 13
> >># 111 14
> >>
> >>A.K.
> >>
> >>________________________________
> >>From: HJ YAN <yhj...@googlemail.com>
> >>To: arun <smartpink...@yahoo.com>
> >>Cc: r-help@r-project.org
> >>Sent: Wednesday, March 6, 2013 8:24 AM
> >>Subject: Re: [R] How to combine conditional argument and logical argument in R to create subset of data...
> >>
> >>Hi Arun
> >>
> >>Thank you so much for the help, that's really helpful!!
> >>
> >>Also, I have a quick question about the code below, where I cannot see
> >>why it doesn't work...
> >>
> >>I know the I shou
> >>
> >>V1<-c(rep(111,4),rep(222,4),rep(333,4),rep(111,4),rep(222,4),rep(333,3))
> >>V2<-c(1:23)
> >>Tem1<-cbind(V1,V2)
> >>
> >>So Tem1 looks like...
> >>> Tem1
> >>       V1 V2
> >> [1,] 111  1
> >> [2,] 111  2
> >> [3,] 111  3
> >> [4,] 111  4
> >> [5,] 222  5
> >> [6,] 222  6
> >> [7,] 222  7
> >> [8,] 222  8
> >> [9,] 333  9
> >>[10,] 333 10
> >>[11,] 333 11
> >>[12,] 333 12
> >>[13,] 111 13
> >>[14,] 111 14
> >>[15,] 111 15
> >>[16,] 111 16
> >>[17,] 222 17
> >>[18,] 222 18
> >>[19,] 222 19
> >>[20,] 222 20
> >>[21,] 333 21
> >>[22,] 333 22
> >>[23,] 333 23
> >>
> >>I would like the outcome to be...
> >>
> >> V1  V2
> >>
> >> 111  1
> >> 111  2
> >> 111  3
> >> 111  4
> >> 111 13
> >> 111 14
> >> 111 15
> >> 111 16
> >> 222  5
> >> 222  6
> >> 222  7
> >> 222  8
> >> 222 17
> >> 222 18
> >> 222 19
> >> 222 20
> >> 333  9
> >> 333 10
> >> 333 11
> >> 333 12
> >> 333 21
> >> 333 22
> >> 333 23
> >>
> >>So I tried code as below
> >>------------------------------------------
> >>Tem3<-c(NA,NA)
> >>for(i in length(unique(Tem1[,1]))){
> >>Tem2<-subset(Tem1,Tem1[,1]==unique(Tem1[,1])[i])
> >>Tem3<-rbind(Tem3,Tem2)
> >>Tem3
> >>}
> >>Tem4<-Tem3[-1,]
> >>---------------------------------------
> >>
> >>And only get this...
> >>
> >> V1  V2
> >> 333  9
> >> 333 10
> >> 333 11
> >> 333 12
> >> 333 21
> >> 333 22
> >> 333 23
> >>
> >>I tried to run the code step by step, e.g. letting i=1, then i=2, then
> >>i=3, and updating my Tem3. I did get what I wanted, but wondered why
> >>the loop above did not work...??
> >>
> >>Many thanks in advance!
> >>
> >>HJ
> >>
> >>On Wed, Mar 6, 2013 at 4:36 AM, arun <smartpink...@yahoo.com> wrote:
> >>
> >>Hi,
> >>>
> >>> b[b[,4]>15 & (b[,1]>4|is.na(b[,1])) & (b[,2]>4|is.na(b[,2])),]
> >>>#     [,1] [,2] [,3] [,4] [,5]
> >>>#[1,]    6   NA   NA   16   20
> >>>#[2,]   NA    5   NA   17   21
> >>>A.K.
> >>>
> >>>----- Original Message -----
> >>>From: HJ YAN <yhj...@googlemail.com>
> >>>To: r-help@r-project.org
> >>>Cc:
> >>>Sent: Tuesday, March 5, 2013 9:33 PM
> >>>Subject: [R] How to combine conditional argument and logical argument in R to create subset of data...
> >>>
> >>>Dear R user
> >>>
> >>>I have data created using the code below
> >>>
> >>>b<-matrix(2:21,nrow=4)
> >>>b[,1:3]=NA
> >>>b[4,2]=5
> >>>b[3,1]=6
> >>>
> >>>Now the data is
> >>>
> >>>> b
> >>>     [,1] [,2] [,3] [,4] [,5]
> >>>[1,]   NA   NA   NA   14   18
> >>>[2,]   NA   NA   NA   15   19
> >>>[3,]    6   NA   NA   16   20
> >>>[4,]   NA    5   NA   17   21
> >>>
> >>>I want to keep the rows where column 4 is greater than 15 and the
> >>>values in columns 1 and 2 are each either greater than 4 or 'NA'. So I
> >>>would like to have my outcome as below...
> >>>
> >>>[3,]    6   NA   NA   16   20
> >>>[4,]   NA    5   NA   17   21
> >>>
> >>>I thought something like the code below was going to work, but it only
> >>>returns the last row, i.e. "NA 5 NA 17 21"...
> >>>
> >>>bb<-b[which( (b[,2]>4 | b[,2]==NA) & (b[,1]>4 | b[,1]==NA) & b[,4]>15) ,]
> >>>
> >>>Please could anyone help?
> >>>
> >>>Many thanks in advance
> >>>
> >>>HJ
> >>>
> >>>______________________________________________
> >>>R-help@r-project.org mailing list
> >>>https://stat.ethz.ch/mailman/listinfo/r-help
> >>>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >>>and provide commented, minimal, self-contained, reproducible code.
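A short note on why the `b[,1]==NA` test in the original question cannot work, in case it helps later readers: in R, comparing anything with NA gives NA rather than TRUE or FALSE, so `is.na()` is the test to use, as in the reply quoted above. The following is only a minimal sketch on the same `b` matrix; the helper name `ok` is made up for illustration and is not from the thread.

```r
# Rebuild the example matrix from the original question
b <- matrix(2:21, nrow = 4)
b[, 1:3] <- NA
b[4, 2] <- 5
b[3, 1] <- 6

# Comparing with NA never yields TRUE or FALSE, only NA
na_cmp <- b[, 1] == NA

# Hypothetical helper: TRUE where a value is missing or greater than 'limit'
ok <- function(x, limit) is.na(x) | x > limit

# Column 4 above 15, and columns 1 and 2 each either NA or above 4
keep <- b[, 4] > 15 & ok(b[, 1], 4) & ok(b[, 2], 4)

# drop = FALSE keeps the result a matrix even if only one row survives
bb <- b[keep, , drop = FALSE]
bb
```

Putting `is.na(x)` first in the `|` matters less than it may look: `TRUE | NA` is TRUE in R either way, so the missing entries are kept regardless of the order of the two conditions.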
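On the open question at the top of the thread (the difference between Tem1 and Tem2 when whole rows repeat): with the latest example data, matching on `paste0` keys drops the rows (222, 6), (222, 7) and (333, 11), because identical rows are already present in Tem2. One possible way around this, offered only as a sketch, is to number the occurrences of each key so that the second copy of a row is treated as different from the first. The helper `occ_key` and the variable `idx` are hypothetical names introduced here, and the approach assumes Tem2 really is a subset of Tem1.

```r
# Data from the latest message in the thread
V1 <- rep(c(rep(111, 5), rep(222, 5), rep(333, 5)), 2)
V2 <- c(1:23, 6, 7, 11, 4, 5, 6, 7)
Tem1 <- cbind(V1, V2)
idx  <- c(1:11, 13:15, 18:19)   # rows used to build Tem2
Tem2 <- Tem1[idx, ]

# Hypothetical helper: label every row with its key plus a running count of
# how many times that key has been seen so far (1st, 2nd, ... occurrence)
occ_key <- function(m) {
  # an explicit separator keeps e.g. (11, 1) and (1, 11) from colliding
  key <- paste(m[, 1], m[, 2], sep = "_")
  occ <- ave(seq_along(key), key, FUN = seq_along)
  paste(key, occ, sep = "#")
}

# Rows of Tem1 whose (key, occurrence) label has no partner in Tem2
diff_rows <- Tem1[!occ_key(Tem1) %in% occ_key(Tem2), , drop = FALSE]
diff_rows

# For comparison: number of rows kept by the plain paste0 approach
n_paste0 <- sum(!paste0(Tem1[, 1], Tem1[, 2]) %in% paste0(Tem2[, 1], Tem2[, 2]))
```

When the row positions used to build Tem2 are known, as `idx` is here, `Tem1[-idx, ]` gives the same rows directly and is much cheaper. The occurrence-count version is only needed when Tem2 arrives without that information, and it cannot tell which of several identical rows was removed, only how many of them remain.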