Re: [R] Doubt in simple merge

2014-01-17 Thread Marc Schwartz

On Jan 16, 2014, at 11:14 PM, kingsly  wrote:

> Thank you dear friends.  You have cleared my first doubt.  
> 
> My second doubt:
> I have the same data sets "Elder" and "Younger".
> Elder <- data.frame(
>   ID=c("ID1","ID2","ID3"),
>   age=c(38,35,31))
> Younger <- data.frame(
>   ID=c("ID4","ID5","ID3"),
>   age=c(29,21,"NA"))
> 
> 
> Row ID3 appears in both data sets. It has a value (31) in "Elder" but "NA" 
> in "Younger".
> 
> I need output like this.
> 
> ID   age
> ID1  38
> ID2  35
> ID3  31
> ID4  29
> ID5  21 
> 
> Kindly help me.


First, there is a problem with the way you created Younger: you entered the 
missing value as "NA", which is a character string rather than the NA constant. 
That coerces the entire column to a factor rather than a numeric:

> str(Younger)
'data.frame':   3 obs. of  2 variables:
 $ ID : Factor w/ 3 levels "ID3","ID4","ID5": 2 3 1
 $ age: Factor w/ 3 levels "21","29","NA": 2 1 3

This then causes problems in merge():

DF <- merge(Elder, Younger, by = c("ID", "age"), all = TRUE)

> str(DF)
'data.frame':   6 obs. of  2 variables:
 $ ID : Factor w/ 5 levels "ID1","ID2","ID3",..: 1 2 3 3 4 5
 $ age: chr  "38" "35" "31" "NA" ...


Note that 'age' becomes a character vector, again rather than numeric.

Thus, create Younger with a real NA instead:

Younger <- data.frame(ID = c("ID4", "ID5", "ID3"), age = c(29, 21, NA))
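
As a minimal sketch of the difference (the vector names ages_quoted and 
ages_missing are made up for illustration), a quoted "NA" is an ordinary 
string, while a bare NA is the missing-value marker and leaves the vector 
numeric:

```r
# Quoted "NA" forces the whole vector to character; bare NA keeps it numeric.
ages_quoted  <- c(29, 21, "NA")
ages_missing <- c(29, 21, NA)

is.numeric(ages_quoted)    # FALSE
is.numeric(ages_missing)   # TRUE

# Only the bare NA is recognised as missing.
is.na(ages_quoted)         # FALSE FALSE FALSE
is.na(ages_missing)        # FALSE FALSE TRUE
```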

Now, when you merge as before, you get:

> str(merge(Elder, Younger, by = c("ID", "age"), all = TRUE))
'data.frame':   6 obs. of  2 variables:
 $ ID : Factor w/ 5 levels "ID1","ID2","ID3",..: 1 2 3 3 4 5
 $ age: num  38 35 31 NA 29 21


> merge(Elder, Younger, by = c("ID", "age"), all = TRUE)
   ID age
1 ID1  38
2 ID2  35
3 ID3  31
4 ID3  NA
5 ID4  29
6 ID5  21


Presuming that you want to consistently remove any NA values that may arise 
from either data frame:

> na.omit(merge(Elder, Younger, by = c("ID", "age"), all = TRUE))
   ID age
1 ID1  38
2 ID2  35
3 ID3  31
5 ID4  29
6 ID5  21


See ?na.omit
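
A small sketch, not part of the original reply: for a data frame like this 
one, na.omit() should keep the same rows as subsetting with complete.cases(). 
The names merged, cleaned and same_rows are hypothetical.

```r
Elder   <- data.frame(ID = c("ID1", "ID2", "ID3"), age = c(38, 35, 31))
Younger <- data.frame(ID = c("ID4", "ID5", "ID3"), age = c(29, 21, NA))

merged <- merge(Elder, Younger, by = c("ID", "age"), all = TRUE)

# Two ways to drop every row that contains a missing value.
cleaned   <- na.omit(merged)
same_rows <- merged[complete.cases(merged), ]

nrow(merged)    # 6
nrow(cleaned)   # 5
```

Note that either approach would also drop an ID entirely if its only row had 
a missing age.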

Regards,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Doubt in simple merge

2014-01-17 Thread PIKAL Petr
Hi,

Are you really sure that age in Younger is a factor? If it were numeric, you 
could post-process the result of merge to get rid of the NA:

Younger$age<-as.numeric(as.character(Younger$age))
Warning message:
NAs introduced by coercion
komplet <- merge(Elder, Younger, all = TRUE)
komplet[complete.cases(komplet),]
   ID age
1 ID1  38
2 ID2  35
3 ID3  31
5 ID4  29
6 ID5  21

If it **is** a factor like in your example

> str(Younger)
'data.frame':   3 obs. of  2 variables:
 $ ID : Factor w/ 3 levels "ID3","ID4","ID5": 2 3 1
 $ age: Factor w/ 3 levels "21","29","NA": 2 1 3

you can do it in a similar way, but it is not as easy.
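
A hedged sketch of why the factor case needs the extra as.character() step 
(age_factor and age_numeric are names invented here): converting a factor 
directly gives its internal level codes, not the ages.

```r
# Build the factor explicitly so the example does not depend on the
# stringsAsFactors default of the R version in use.
age_factor <- factor(c("29", "21", "NA"))

# Direct conversion returns the level codes, which are not the ages.
as.integer(age_factor)   # 2 1 3

# Going through character recovers the values; the string "NA" becomes a
# real NA (with a coercion warning, suppressed here).
age_numeric <- suppressWarnings(as.numeric(as.character(age_factor)))
age_numeric              # 29 21 NA
```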
Petr






Re: [R] Doubt in simple merge

2014-01-16 Thread kingsly
Thank you dear friends.  You have cleared my first doubt.  

My second doubt:
I have the same data sets "Elder" and "Younger".
Elder <- data.frame(
  ID=c("ID1","ID2","ID3"),
  age=c(38,35,31))
Younger <- data.frame(
  ID=c("ID4","ID5","ID3"),
  age=c(29,21,"NA"))


Row ID3 appears in both data sets. It has a value (31) in "Elder" but "NA" in 
"Younger".

I need output like this.

ID    age
ID1  38
ID2  35
ID3  31
ID4  29
ID5  21 

Kindly help me.





Re: [R] Doubt in simple merge

2014-01-16 Thread Marc Schwartz
Not quite:

> rbind(Elder, Younger)
   ID age
1 ID1  38
2 ID2  35
3 ID3  31
4 ID4  29
5 ID5  21
6 ID3  31

Note that ID3 is duplicated.
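
As an aside, a sketch of an alternative that was not proposed in this thread: 
since rbind() simply stacks the rows, wrapping the result in unique() drops 
rows that are exact duplicates. The names stacked and deduped are only for 
illustration.

```r
Elder   <- data.frame(ID = c("ID1", "ID2", "ID3"), age = c(38, 35, 31))
Younger <- data.frame(ID = c("ID4", "ID5", "ID3"), age = c(29, 21, 31))

stacked <- rbind(Elder, Younger)   # 6 rows, the ID3 row appears twice
deduped <- unique(stacked)         # 5 rows, the repeated ID3 row is dropped

nrow(stacked)   # 6
nrow(deduped)   # 5
```

This only removes rows that match in every column, so an ID that occurs with 
two different ages would still appear twice.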


Should be:

> merge(Elder, Younger, by = c("ID", "age"), all = TRUE)
   ID age
1 ID1  38
2 ID2  35
3 ID3  31
4 ID4  29
5 ID5  21


He wants to do a join on both "ID" and "age" to avoid duplicate rows when the 
same ID and age occur in both data frames. If a column with the same name 
(e.g., "Var") appears in both data frames and is not part of the 'by' 
argument, you end up with Var.x and Var.y in the result.

In the case of two occurrences of the same ID but two different ages, if that 
is possible, both rows would be added to the result using the above code.
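
To make the contrast concrete, here is a short sketch using the data from the 
original post (by_id and by_both are hypothetical names): merging on "ID" 
alone produces the suffixed columns, while merging on both columns does not.

```r
Elder   <- data.frame(ID = c("ID1", "ID2", "ID3"), age = c(38, 35, 31))
Younger <- data.frame(ID = c("ID4", "ID5", "ID3"), age = c(29, 21, 31))

# 'age' is in both data frames but not in 'by', so it is kept twice.
by_id <- merge(Elder, Younger, by = "ID", all = TRUE)
names(by_id)     # "ID" "age.x" "age.y"

# With both columns in 'by', matching rows collapse into one.
by_both <- merge(Elder, Younger, by = c("ID", "age"), all = TRUE)
names(by_both)   # "ID" "age"
nrow(by_both)    # 5
```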

Regards,

Marc Schwartz




Re: [R] Doubt in simple merge

2014-01-16 Thread Frede Aakmann Tøgersen
Oops, sorry, that should have been

mer <- rbind(Elder, Younger)

/frede




Re: [R] Doubt in simple merge

2014-01-16 Thread Frede Aakmann Tøgersen
No, I think the OP wants

mer <- merge(Elder, Younger)

Br. Frede




Re: [R] Doubt in simple merge

2014-01-16 Thread Adams, Jean
You are telling it to merge by ID only.  But it sounds like you would like
it to merge by both ID and age.

merge(Elder, Younger, all=TRUE)
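
A brief sketch of why omitting 'by' works here, relying on merge()'s 
documented default of by = intersect(names(x), names(y)); the names common, 
implicit and explicit are made up for the example.

```r
Elder   <- data.frame(ID = c("ID1", "ID2", "ID3"), age = c(38, 35, 31))
Younger <- data.frame(ID = c("ID4", "ID5", "ID3"), age = c(29, 21, 31))

# The columns the two data frames share are what merge() joins on by default.
common <- intersect(names(Elder), names(Younger))   # "ID" "age"

implicit <- merge(Elder, Younger, all = TRUE)
explicit <- merge(Elder, Younger, by = common, all = TRUE)

identical(implicit, explicit)   # TRUE
```

Spelling out 'by' is arguably safer when the data frames may later gain other 
shared columns.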

Jean


On Thu, Jan 16, 2014 at 6:25 AM, kingsly  wrote:

> Dear R community
>
> I have two data sets called "Elder" and "Younger".
> This is my code for simple merge.
>
> Elder <- data.frame(
>   ID=c("ID1","ID2","ID3"),
>   age=c(38,35,31))
> Younger <- data.frame(
>   ID=c("ID4","ID5","ID3"),
>   age=c(29,21,31))
>
> mer <- merge(Elder,Younger,by="ID", all=T)
>
> Output I am expecting:
>
> ID   age
> ID1  38
> ID2  35
> ID3  31
> ID4  29
> ID5  21
>
> It looks very simple.  But I need help.
> When I run the code it gives me age.x and age.y.
> thank you