Hi Arun

Many thanks for the hint about using 'paste0'!

But by coincidence, the previous example contained no repeated (V1, V2)
pairs, so no key in indxTem1 occurred more than once. I have changed the
data as below so that some rows are repeated, which is very likely to
happen in my real data...


V1<-rep(c(rep(111,5),rep(222,5),rep(333,5)),2)  # V1 here is a data index with lots of repeated numeric values
V2<-c(1:23, 6,7,11,4,5,6,7)  # there are also duplicated values in V2
Tem1<-cbind(V1,V2)
Tem2<-Tem1[c(1:11,13:15,18:19),] # I know that Tem2 is a subset of Tem1...


And my target outcome is the difference between Tem1 and Tem2 as below:


  V1 V2

 333 12
 111 16
 111 17
 111 20
 222 21
 222 22
 222 23
 222  6
 222  7
 333 11
 333  4
 333  5
 333  6
 333  7

Many thanks
HJ
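
One possible way to get this outcome is to treat the comparison as a multiset difference: number the repeats of each (V1, V2) pair within each matrix, so that the second copy of a row in Tem1 is only cancelled by a second copy in Tem2. The following is a minimal sketch under that assumption, not a tested solution for 4 GB of data; the names key1, key2, occ1, occ2, id1, id2 and diffRows are illustrative and do not appear elsewhere in the thread. It also uses paste() with a separator rather than paste0(), because without one the pairs (11, 16) and (111, 6) would both give the key "1116".

```r
V1 <- rep(c(rep(111, 5), rep(222, 5), rep(333, 5)), 2)
V2 <- c(1:23, 6, 7, 11, 4, 5, 6, 7)
Tem1 <- cbind(V1, V2)
Tem2 <- Tem1[c(1:11, 13:15, 18:19), ]

# One key per row; the separator keeps (11, 16) and (111, 6) distinct
key1 <- paste(Tem1[, 1], Tem1[, 2], sep = "_")
key2 <- paste(Tem2[, 1], Tem2[, 2], sep = "_")

# Occurrence number of each key within its own matrix (1 for the first copy,
# 2 for the second copy, and so on)
occ1 <- ave(seq_along(key1), key1, FUN = seq_along)
occ2 <- ave(seq_along(key2), key2, FUN = seq_along)

# Key plus occurrence number identifies each copy of a repeated row
id1 <- paste(key1, occ1, sep = "#")
id2 <- paste(key2, occ2, sep = "#")

# Rows of Tem1 that have no matching copy in Tem2
diffRows <- Tem1[!id1 %in% id2, , drop = FALSE]
diffRows
```

With repeated rows, this drops the earliest copies in Tem1 first. Because the copies are identical, the returned values are the same whichever copy was actually taken into Tem2.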



On Wed, Mar 6, 2013 at 9:29 PM, arun <smartpink...@yahoo.com> wrote:

>
>
> Hi,
> How about this:
>
> indxTem1<-paste0(Tem1[,1],Tem1[,2])
>  indxTem2<-paste0(Tem2[,1],Tem2[,2])
> Tem1[!indxTem1%in%indxTem2,]
> #       V1 V2
>  #[1,] 333 11
>  #[2,] 111 16
>  #[3,] 111 17
>  #[4,] 111 20
>  #[5,] 222 21
>  #[6,] 222 22
>  #[7,] 222 23
>  #[8,] 222  1
>  #[9,] 222  2
> #[10,] 333  3
> #[11,] 333  4
> #[12,] 333  5
> #[13,] 333  6
> #[14,] 333  7
>
>
> A.K.
> ________________________________
> From: HJ YAN <yhj...@googlemail.com>
> To: arun <smartpink...@yahoo.com>
> Cc: r-help@r-project.org
> Sent: Wednesday, March 6, 2013 4:09 PM
> Subject: Re: [R] How to combine conditional argument and logical argument
> in R to create subset of data...
>
>
> Dear Arun
>
>
> Thanks a million for your prompt reply; I love all four approaches in it.
>
> I tried the code and just realised there is an issue: in my real work, my
> data is about 4 GB in size and I'm sure that there are many duplicated
> values in V2, so my V1 and V2 should be something like
>
>
> V1<-rep(c(rep(111,5),rep(222,5),rep(333,5)),2)  # V1 here is a data index with lots of repeated numeric values
> V2<-c(1:23, 1:7)  # there are also duplicated values in V2
> Tem1<-cbind(V1,V2)
> Tem2<-Tem1[c(1:10,12:15,18:19),] # I know that Tem2 is a subset of Tem1...
>
>
> So how do I get the difference between Tem1 and Tem2 when the values
> in V2 contain duplicates?
>
>   V1 V2
>  333 11
>  111 16
>  111 17
>  111 20
>  222 21
>  222 22
>  222 23
>  222  1
>  222  2
>  333  3
>  333  4
>  333  5
>  333  6
>  333  7
>
>
> Massive thanks
> HJ
>
>
>
>
>
> On Wed, Mar 6, 2013 at 4:12 PM, arun <smartpink...@yahoo.com> wrote:
>
>
> >
> >Just to add:
> >
> >Tem1[Tem1[,2]%in%setdiff(Tem1[,2],Tem2[,2]),]
> >
> >A.K.
> >
> >----- Original Message -----
> >
> >From: arun <smartpink...@yahoo.com>
> >To: HJ YAN <yhj...@googlemail.com>
> >Cc: R help <r-help@r-project.org>
> >Sent: Wednesday, March 6, 2013 11:06 AM
> >Subject: Re: [R] How to combine conditional argument and logical argument
> in R to create subset of data...
> >
> >Hi,
> >No problem.
> >V1<-rep(c(rep(111,5),rep(222,5),rep(333,5)),2)
> > length(V1)
> >#[1] 30
> >
> > V2<- c(1:30) #should be the same length as V1
> >Tem1<- cbind(V1,V2)
> >Tem2<-Tem1[1:20,]
> >
> >Tem1[!Tem1[,2]%in%Tem2[,2],]
> > #      V1 V2
> > #[1,] 222 21
> > #[2,] 222 22
> > #[3,] 222 23
> > #[4,] 222 24
> > #[5,] 222 25
> > #[6,] 333 26
> > #[7,] 333 27
> > #[8,] 333 28
> > #[9,] 333 29
> >#[10,] 333 30
> >
> >#or
> >subset(Tem1,!V2%in% Tem2[,2])
> >#or
> > Tem1[is.na(match(Tem1[,2],Tem2[,2])),]
> > #      V1 V2
> > #[1,] 222 21
> > #[2,] 222 22
> > #[3,] 222 23
> > #[4,] 222 24
> > #[5,] 222 25
> > #[6,] 333 26
> > #[7,] 333 27
> > #[8,] 333 28
> > #[9,] 333 29
> >#[10,] 333 30
> >A.K.
> >
> >
> >
> >
> >________________________________
> >From: HJ YAN <yhj...@googlemail.com>
> >To: arun <smartpink...@yahoo.com>
> >Sent: Wednesday, March 6, 2013 10:33 AM
> >Subject: Re: [R] How to combine conditional argument and logical argument
> in R to create subset of data...
> >
> >
> >Thank you SO MUCH Arun!!!
> >
> >That's brilliant. I've learnt some very useful new R commands now, e.g.
> >'do.call' and 'split', and I see where my code went wrong.
> >
> >I greatly appreciate your prompt reply.
> >
> >Also, I wonder whether there is a package that can find the difference
> >between two data frames, e.g. when one is a subset of the other? For example:
> >
> > V1<-rep(c(rep(111,5),rep(222,5),rep(333,5)),2)
> > V2<-c(1:23)
> >Tem1<-cbind(V1,V2)
> >
> >Tem2<-Tem1[1:20,]
> >
> >
> >How do I get outcome like
> >
> >[21,] 333 21
> >[22,] 333 22
> >[23,] 333 23
> >
> >
> >P.S. I used 'setdiff' before, but it seems to work only for vectors, not
> >for data frames?
> >
> >
> >Sorry for so many questions today, as I'm coding for a work deadline
> tonight.
> >
> >
> >Many thanks!
> >Cheers
> >HJ
> >
> >
> >
> >
> >
> >
> >
> >On Wed, Mar 6, 2013 at 1:55 PM, arun <smartpink...@yahoo.com> wrote:
> >
> >Hi,
> >>You can also try this:
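> >> Tem3<- list()
> >> for(i in unique(Tem1[,1])) {
> >> # index by name so the list gets one element per group
> >> Tem3[[as.character(i)]]<- subset(Tem1,Tem1[,1]==i)
> >> }
> >> # bind the groups together once, after the loop
> >> Tem4<- do.call(rbind,Tem3)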
> >>head(Tem4)
> >>#      V1 V2
> >>#[1,] 111  1
> >>#[2,] 111  2
> >>#[3,] 111  3
> >>#[4,] 111  4
> >>#[5,] 111 13
> >>#[6,] 111 14
> >>
> >>
> >>#or
> >>Tem3<-c(NA,NA)
> >> for(i in unique(Tem1[,1])) {
> >> Tem2<- subset(Tem1, Tem1[,1]==i)
> >> Tem3<- rbind(Tem3,Tem2)
> >> Tem5<- Tem3[-1,]
> >> }
> >>head(Tem5)
> >>#  V1 V2
> >># 111  1
> >># 111  2
> >># 111  3
> >># 111  4
> >># 111 13
> >># 111 14
> >>
> >>A.K.
> >>
> >>
> >>________________________________
> >>From: HJ YAN <yhj...@googlemail.com>
> >>
> >>To: arun <smartpink...@yahoo.com>
> >>Cc: r-help@r-project.org
> >>Sent: Wednesday, March 6, 2013 8:24 AM
> >>Subject: Re: [R] How to combine conditional argument and logical
> argument in R to create subset of data...
> >>
> >>
> >>
> >>Hi Arun
> >>
> >>
> >>Thank you so much for the help, that's really helpful!!
> >>
> >>Also, I have a quick question about the code below: I cannot see why it
> >>doesn't work...
> >>
> >>I know the I shou
> >>
> >>V1<-c(rep(111,4),rep(222,4),rep(333,4),rep(111,4),rep(222,4),rep(333,3))
> >>V2<-c(1:23)
> >>Tem1<-cbind(V1,V2)
> >>
> >>
> >>So Tem1 looks like...
> >>> Tem1
> >>       V1 V2
> >> [1,] 111  1
> >> [2,] 111  2
> >> [3,] 111  3
> >> [4,] 111  4
> >> [5,] 222  5
> >> [6,] 222  6
> >> [7,] 222  7
> >> [8,] 222  8
> >> [9,] 333  9
> >>[10,] 333 10
> >>[11,] 333 11
> >>[12,] 333 12
> >>[13,] 111 13
> >>[14,] 111 14
> >>[15,] 111 15
> >>[16,] 111 16
> >>[17,] 222 17
> >>[18,] 222 18
> >>[19,] 222 19
> >>[20,] 222 20
> >>[21,] 333 21
> >>[22,] 333 22
> >>[23,] 333 23
> >>
> >>I would like the outcome to be...
> >>
> >>      V1 V2
> >>
> >>     111  1
> >>     111  2
> >>     111  3
> >>     111  4
> >>     111 13
> >>     111 14
> >>     111 15
> >>     111 16
> >>     222  5
> >>     222  6
> >>     222  7
> >>     222  8
> >>     222 17
> >>     222 18
> >>     222 19
> >>     222 20
> >>     333  9
> >>     333 10
> >>     333 11
> >>     333 12
> >>     333 21
> >>     333 22
> >>     333 23
> >>
> >>
> >>So I tried code as below
> >>------------------------------------------
> >>Tem3<-c(NA,NA)
> >>for(i in length(unique(Tem1[,1]))){
> >>Tem2<-subset(Tem1,Tem1[,1]==unique(Tem1[,1])[i])
> >>Tem3<-rbind(Tem3,Tem2)
> >>Tem3
> >>}
> >>Tem4<-Tem3[-1,]
> >>---------------------------------------
> >>
> >>And only get this...
> >>
> >>
> >> V1 V2
> >> 333  9
> >> 333 10
> >> 333 11
> >> 333 12
> >> 333 21
> >> 333 22
> >> 333 23
> >>
> >>
> >>I tried running the code step by step, e.g. setting i=1, then i=2, then
> >>i=3, and updating Tem3 each time, and I did get what I wanted, but I
> >>wonder why it did not work in the loop above?
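
A likely explanation for the loop above: `length(unique(Tem1[,1]))` is the single number 3, so `for(i in length(...))` runs the body once with i = 3, which is why only the 333 rows come back. Iterating over `seq_along(...)` visits 1, 2 and 3. The sketch below illustrates this on the Tem1 from this message; the names groups, visitedWrong and visitedRight are made up for the illustration. It also shows a loop-free alternative using order(), which leaves tied rows in their original order.

```r
V1 <- c(rep(111, 4), rep(222, 4), rep(333, 4), rep(111, 4), rep(222, 4), rep(333, 3))
V2 <- 1:23
Tem1 <- cbind(V1, V2)

groups <- unique(Tem1[, 1])

# length(groups) is just 3, so this loop runs once, with i = 3
visitedWrong <- c()
for (i in length(groups)) visitedWrong <- c(visitedWrong, i)

# seq_along(groups) is 1, 2, 3, so every group is visited
visitedRight <- c()
for (i in seq_along(groups)) visitedRight <- c(visitedRight, i)

# Loop-free alternative: sort the rows by V1; ties keep their original order
Tem4 <- Tem1[order(Tem1[, 1]), ]
Tem4
```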
> >>
> >>
> >>Many thanks in advance!
> >>
> >>HJ
> >>
> >>On Wed, Mar 6, 2013 at 4:36 AM, arun <smartpink...@yahoo.com> wrote:
> >>
> >>Hi,
> >>>
> >>> b[b[,4]>15 & (b[,1]>4|is.na(b[,1])) & (b[,2]>4|is.na(b[,2])),]
> >>> #    [,1] [,2] [,3] [,4] [,5]
> >>>#[1,]    6   NA   NA   16   20
> >>>#[2,]   NA    5   NA   17   21
> >>>A.K.
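
The reason is.na() is used here rather than `== NA`: any comparison with NA evaluates to NA, never TRUE, so a test such as `b[,1]==NA` cannot pick out the missing values. A minimal sketch on the matrix b from the original post follows; the names eqNA, isNA and keep are illustrative only.

```r
b <- matrix(2:21, nrow = 4)
b[, 1:3] <- NA
b[4, 2] <- 5
b[3, 1] <- 6

# Comparing with NA gives NA for every element, not TRUE or FALSE
eqNA <- b[, 1] == NA

# is.na() gives a proper TRUE/FALSE answer for every element
isNA <- is.na(b[, 1])

# NA | TRUE is TRUE, so the is.na() term rescues rows where the value is missing
keep <- b[, 4] > 15 &
  (b[, 1] > 4 | is.na(b[, 1])) &
  (b[, 2] > 4 | is.na(b[, 2]))

bb <- b[keep, , drop = FALSE]
bb
```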
> >>>
> >>>
> >>>
> >>>----- Original Message -----
> >>>From: HJ YAN <yhj...@googlemail.com>
> >>>To: r-help@r-project.org
> >>>Cc:
> >>>Sent: Tuesday, March 5, 2013 9:33 PM
> >>>Subject: [R] How to combine conditional argument and logical argument
> in R to create subset of data...
> >>>
> >>>Dear R user
> >>>
> >>>I have data created using code below
> >>>
> >>>b<-matrix(2:21,nrow=4)
> >>>b[,1:3]=NA
> >>>b[4,2]=5
> >>>b[3,1]=6
> >>>
> >>>Now the data is
> >>>
> >>>> b
> >>>     [,1] [,2] [,3] [,4] [,5]
> >>>[1,]   NA   NA   NA   14   18
> >>>[2,]   NA   NA   NA   15   19
> >>>[3,]    6   NA   NA   16   20
> >>>[4,]   NA    5   NA   17   21
> >>>
> >>>
> >>>I want to keep the rows where column 4 is greater than 15 and the values
> >>>in columns 1 and 2 are each either greater than 4 or 'NA'. So I would
> >>>like to have my outcome as below...
> >>>
> >>>[3,]    6   NA   NA   16   20
> >>>[4,]   NA    5   NA   17   21
> >>>
> >>>I thought something like the code below was going to work, but it only
> >>>returns the last row, i.e. "NA 5 NA 17 21"...
> >>>
> >>>bb<-b[which( (b[,2]>4 | b[,2]==NA) & (b[,1]>4 | b[,1]==NA) & b[,4]>15),]
> >>>
> >>>
> >>>Please could anyone help?
> >>>
> >>>Many thanks in advance
> >>>
> >>>HJ
> >>>
> >>>    [[alternative HTML version deleted]]
> >>>
> >>>______________________________________________
> >>>R-help@r-project.org mailing list
> >>>https://stat.ethz.ch/mailman/listinfo/r-help
> >>>PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> >>>and provide commented, minimal, self-contained, reproducible code.
> >>>
> >>>
> >>
> >
> >
> >
>
