Re: [R] Doubt in simple merge
On Jan 16, 2014, at 11:14 PM, kingsly wrote:

> Thank you dear friends. You have cleared my first doubt.
>
> My second doubt:
> I have the same data sets "Elder" and "Younger".
>
> Elder <- data.frame(
>   ID=c("ID1","ID2","ID3"),
>   age=c(38,35,31))
> Younger <- data.frame(
>   ID=c("ID4","ID5","ID3"),
>   age=c(29,21,"NA"))
>
> Row ID3 comes in both data set. It has a value (31) in "Elder" while "NA" in
> "Younger".
>
> I need output like this.
>
> ID  age
> ID1 38
> ID2 35
> ID3 31
> ID4 29
> ID5 21
>
> Kindly help me.

First, there is a problem with the way in which you created Younger: you have the NA as "NA", which is a character string and coerces the entire column to a factor rather than a numeric:

> str(Younger)
'data.frame': 3 obs. of 2 variables:
 $ ID : Factor w/ 3 levels "ID3","ID4","ID5": 2 3 1
 $ age: Factor w/ 3 levels "21","29","NA": 2 1 3

That then causes problems in merge():

DF <- merge(Elder, Younger, by = c("ID", "age"), all = TRUE)

> str(DF)
'data.frame': 6 obs. of 2 variables:
 $ ID : Factor w/ 5 levels "ID1","ID2","ID3",..: 1 2 3 3 4 5
 $ age: chr "38" "35" "31" "NA" ...

Note that 'age' becomes a character vector, again rather than numeric.

Thus:

Younger <- data.frame(ID = c("ID4", "ID5", "ID3"),
                      age = c(29, 21, NA))

Now, when you merge as before, you get:

> str(merge(Elder, Younger, by = c("ID", "age"), all = TRUE))
'data.frame': 6 obs. of 2 variables:
 $ ID : Factor w/ 5 levels "ID1","ID2","ID3",..: 1 2 3 3 4 5
 $ age: num 38 35 31 NA 29 21

> merge(Elder, Younger, by = c("ID", "age"), all = TRUE)
   ID age
1 ID1  38
2 ID2  35
3 ID3  31
4 ID3  NA
5 ID4  29
6 ID5  21

Presuming that you want to consistently remove any NA values that may arise from either data frame:

> na.omit(merge(Elder, Younger, by = c("ID", "age"), all = TRUE))
   ID age
1 ID1  38
2 ID2  35
3 ID3  31
5 ID4  29
6 ID5  21

See ?na.omit

Regards,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
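The steps in the reply above can be collected into one script. This is only a minimal sketch: it passes stringsAsFactors = FALSE explicitly (an assumption made here, since the default changed in R 4.0.0 and the factor output quoted above comes from an older R), and the names bad_age, merged and result are illustrative, not from the thread.

```r
# Build the data with stringsAsFactors = FALSE so the behaviour is the same
# before and after R 4.0.0, where the data.frame() default changed.
Elder <- data.frame(ID = c("ID1", "ID2", "ID3"),
                    age = c(38, 35, 31),
                    stringsAsFactors = FALSE)

# Mixing numbers with the string "NA" makes c() return a character vector.
bad_age <- c(29, 21, "NA")
class(bad_age)   # "character"

# A bare NA (no quotes) keeps the column numeric.
Younger <- data.frame(ID = c("ID4", "ID5", "ID3"),
                      age = c(29, 21, NA),
                      stringsAsFactors = FALSE)

# Join on both columns, keep unmatched rows from both sides,
# then drop every row that contains an NA.
merged <- merge(Elder, Younger, by = c("ID", "age"), all = TRUE)
result <- na.omit(merged)
print(result)
```

Note that na.omit() drops a row if any column is NA, so with wider data frames it may remove more than intended.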
Re: [R] Doubt in simple merge
Hi

Are you really sure that age in Younger is a factor? If it is numeric (here after converting it with as.numeric(as.character())), you can post-process the result of merge to get rid of the NA rows:

Younger$age <- as.numeric(as.character(Younger$age))
Warning message:
NAs introduced by coercion

komplet <- merge(Elder, Younger, all = TRUE)
komplet[complete.cases(komplet), ]
   ID age
1 ID1  38
2 ID2  35
3 ID3  31
5 ID4  29
6 ID5  21

If it **is** a factor, as in your example

> str(Younger)
'data.frame': 3 obs. of 2 variables:
 $ ID : Factor w/ 3 levels "ID3","ID4","ID5": 2 3 1
 $ age: Factor w/ 3 levels "21","29","NA": 2 1 3

you can do it in a similar way, but not so easily.

Petr
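A minimal sketch of the factor case mentioned above. The factor column is created explicitly with factor() so the example behaves the same in any R version; the names codes and clean are illustrative assumptions, and whether a coercion warning appears may depend on the R version and the strings involved.

```r
Elder <- data.frame(ID = c("ID1", "ID2", "ID3"),
                    age = c(38, 35, 31),
                    stringsAsFactors = FALSE)

# Recreate the problematic column as a factor with the level "NA".
Younger <- data.frame(ID = c("ID4", "ID5", "ID3"),
                      age = factor(c("29", "21", "NA")),
                      stringsAsFactors = FALSE)

# Pitfall: as.numeric() on a factor returns the internal level codes,
# not the numbers shown in the labels.
codes <- as.numeric(Younger$age)   # 2 1 3, as in the str() output above

# Convert the labels instead: factor -> character -> numeric.
# The label "NA" ends up as a real missing value.
Younger$age <- as.numeric(as.character(Younger$age))

# merge() without 'by' joins on all shared columns (here ID and age).
komplet <- merge(Elder, Younger, all = TRUE)

# Keep only rows with no missing values.
clean <- komplet[complete.cases(komplet), ]
print(clean)
```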
Re: [R] Doubt in simple merge
Thank you, dear friends. You have cleared up my first doubt.

My second doubt: I have the same data sets, "Elder" and "Younger".

Elder <- data.frame(
  ID=c("ID1","ID2","ID3"),
  age=c(38,35,31))
Younger <- data.frame(
  ID=c("ID4","ID5","ID3"),
  age=c(29,21,"NA"))

Row ID3 appears in both data sets. It has a value (31) in "Elder" but "NA" in "Younger".

I need output like this:

ID  age
ID1 38
ID2 35
ID3 31
ID4 29
ID5 21

Kindly help me.
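One way to get exactly one row per ID, keeping whichever age is not missing, is to merge on ID alone and then combine the two age columns. This is an alternative sketch that is not proposed anywhere in the thread; it assumes the missing value is entered as a real NA rather than the string "NA", and the suffixes and the names by_id and result are illustrative.

```r
Elder <- data.frame(ID = c("ID1", "ID2", "ID3"),
                    age = c(38, 35, 31),
                    stringsAsFactors = FALSE)
Younger <- data.frame(ID = c("ID4", "ID5", "ID3"),
                      age = c(29, 21, NA),   # real NA, not "NA"
                      stringsAsFactors = FALSE)

# Merge on ID only; the two age columns get distinct suffixes.
by_id <- merge(Elder, Younger, by = "ID", all = TRUE,
               suffixes = c(".elder", ".younger"))

# Take the Elder age where present, otherwise fall back to the Younger age.
by_id$age <- ifelse(is.na(by_id$age.elder), by_id$age.younger, by_id$age.elder)

result <- by_id[, c("ID", "age")]
print(result)
```

If both data frames hold different non-missing ages for the same ID, this sketch silently prefers the Elder value, which may or may not be what is wanted.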
Re: [R] Doubt in simple merge
Not quite:

> rbind(Elder, Younger)
   ID age
1 ID1  38
2 ID2  35
3 ID3  31
4 ID4  29
5 ID5  21
6 ID3  31

Note that ID3 is duplicated.

Should be:

> merge(Elder, Younger, by = c("ID", "age"), all = TRUE)
   ID age
1 ID1  38
2 ID2  35
3 ID3  31
4 ID4  29
5 ID5  21

He wants to do a join on both "ID" and "age" to avoid duplicated rows when the same ID and age occur in both data frames. If the same column name (e.g. "Var") appears in both data frames and is not part of the 'by' argument, you end up with Var.x and Var.y in the result.

In the case of two occurrences of the same ID but with two different ages, if that is possible, both rows would be added to the result using the above code.

Regards,

Marc Schwartz

On Jan 16, 2014, at 9:04 AM, Frede Aakmann Tøgersen wrote:

> Oops, sorry, that should have been
>
> mer <- rbind(Elder, Younger)
>
> /frede
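The three behaviours described above can be compared side by side. A minimal sketch using the data from the original question; Younger2 is hypothetical data invented here to illustrate the conflicting-age case, and stringsAsFactors = FALSE is set only to keep the result independent of the R version.

```r
Elder <- data.frame(ID = c("ID1", "ID2", "ID3"),
                    age = c(38, 35, 31),
                    stringsAsFactors = FALSE)
Younger <- data.frame(ID = c("ID4", "ID5", "ID3"),
                      age = c(29, 21, 31),
                      stringsAsFactors = FALSE)

# 1. Joining on ID only: 'age' is in both inputs but not in 'by',
#    so it is split into age.x and age.y.
by_id <- merge(Elder, Younger, by = "ID", all = TRUE)
names(by_id)   # "ID" "age.x" "age.y"

# 2. Stacking the rows: ID3 appears twice.
stacked <- rbind(Elder, Younger)
nrow(stacked)  # 6

# 3. Joining on both columns: identical (ID, age) pairs collapse to one row.
by_both <- merge(Elder, Younger, by = c("ID", "age"), all = TRUE)
nrow(by_both)  # 5

# Hypothetical conflict: the same ID with a different age in each data frame.
Younger2 <- Younger
Younger2$age[Younger2$ID == "ID3"] <- 30
conflict <- merge(Elder, Younger2, by = c("ID", "age"), all = TRUE)
sum(conflict$ID == "ID3")   # 2: both rows are kept
```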
Re: [R] Doubt in simple merge
Oops, sorry, that should have been

mer <- rbind(Elder, Younger)

/frede

-------- Original message --------
From: Frede Aakmann Tøgersen
Date: 16/01/2014 15.54 (GMT+01:00)
To: "Adams, Jean", kingsly
Cc: R help
Subject: Re: [R] Doubt in simple merge

No, I think the OP wants

mer <- merge(Elder, Younger)

Br. Frede
Re: [R] Doubt in simple merge
No, I think the OP wants

mer <- merge(Elder, Younger)

Br. Frede

-------- Original message --------
From: "Adams, Jean"
Date: 16/01/2014 15.45 (GMT+01:00)
To: kingsly
Cc: R help
Subject: Re: [R] Doubt in simple merge

You are telling it to merge by ID only. But it sounds like you would like it to merge by both ID and age.

merge(Elder, Younger, all=TRUE)

Jean
Re: [R] Doubt in simple merge
You are telling it to merge by ID only. But it sounds like you would like it to merge by both ID and age.

merge(Elder, Younger, all=TRUE)

Jean

On Thu, Jan 16, 2014 at 6:25 AM, kingsly wrote:

> Dear R community
>
> I have two data sets called "Elder" and "Younger".
> This is my code for a simple merge.
>
> Elder <- data.frame(
>   ID=c("ID1","ID2","ID3"),
>   age=c(38,35,31))
> Younger <- data.frame(
>   ID=c("ID4","ID5","ID3"),
>   age=c(29,21,31))
>
> mer <- merge(Elder,Younger,by="ID", all=T)
>
> Output I am expecting:
>
> ID  age
> ID1 38
> ID2 35
> ID3 31
> ID4 29
> ID5 21
>
> It looks very simple. But I need help.
> When I run the code it gives me age.x and age.y.
>
> thank you
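The suggestion above works because merge() on data frames, when 'by' is not given, joins on every column name the two inputs share. A minimal sketch showing that the implicit and explicit calls agree for this data; the names default_by, implicit and explicit are illustrative.

```r
Elder <- data.frame(ID = c("ID1", "ID2", "ID3"),
                    age = c(38, 35, 31),
                    stringsAsFactors = FALSE)
Younger <- data.frame(ID = c("ID4", "ID5", "ID3"),
                      age = c(29, 21, 31),
                      stringsAsFactors = FALSE)

# The columns merge() uses when 'by' is omitted: those common to both inputs.
default_by <- intersect(names(Elder), names(Younger))
default_by   # "ID" "age"

# Omitting 'by' ...
implicit <- merge(Elder, Younger, all = TRUE)

# ... is the same as naming both shared columns explicitly.
explicit <- merge(Elder, Younger, by = c("ID", "age"), all = TRUE)

identical(implicit, explicit)
print(implicit)
```

Spelling out 'by' is still worth considering in real code, since a column added to both data frames later would otherwise silently become part of the join key.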