Re: [R] Merge two data frames on date field
Hi,You haven't provided any example dataset. set.seed(42) dat1 - data.frame(dates=seq(as.POSIXct(2009-01-01 00:00:00,format=%Y-%m-%d %H:%M:%S),by= '12 hour', length=12),Field1=rnorm(12),Field2=LETTERS[1:12]) set.seed(395) dat2 - data.frame(dates=seq(as.POSIXct(2009-01-01 00:00:00,format=%Y-%m-%d %H:%M:%S),by= '6 hour', length=12),Field1=rnorm(12),Field2=sample(LETTERS,12,replace=FALSE)) merge(dat1,dat2,by=dates) # dates Field1.x Field2.x Field1.y Field2.y #1 2009-01-01 00:00:00 1.3709584 A -1.2152900 V #2 2009-01-01 12:00:00 -0.5646982 B -1.2771657 P #3 2009-01-02 00:00:00 0.3631284 C 0.4829300 W #4 2009-01-02 12:00:00 0.6328626 D 0.5468625 Y #5 2009-01-03 00:00:00 0.4042683 E -1.7113256 I #6 2009-01-03 12:00:00 -0.1061245 F 0.4851851 B merge(dat1,dat2,by=dates,all=TRUE)[1:5,] dates Field1.x Field2.x Field1.y Field2.y 1 2009-01-01 00:00:00 1.3709584 A -1.2152900 V 2 2009-01-01 06:00:00 NA NA -0.4633211 G 3 2009-01-01 12:00:00 -0.5646982 B -1.2771657 P 4 2009-01-01 18:00:00 NA NA 0.6011199 N 5 2009-01-02 00:00:00 0.3631284 C 0.4829300 W A.K. why wont two tables merge correctly on a date? I have one table with a field called 'date' 2009-01-01 00:00:00. The other table also has a field call 'date'. The dates are something like 2009-01-01 01:00:00. The two tables are not merging correctly. The dates are not lining up. NA The table also has other fields with dates in them. Sorry the data set is very big so I have not loaded it. The code I'm using to merge is m2 - merge(m1 ,mt,all=TRUE) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Merge two data frames and find common values and non-matching values
Yes, your code did exactly what I needed. Thank you!! -f [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Merge two data frames and find common values and non-matching values
Sorry---I thought it worked but I think I am actually definitely doing something wrong... The problem might be that there are NA's and there are also duplicated values...My fault. I can't figure out what is going wrong... I'll be more thorough and modify the two df to mirror more what I have to explain better: df1 is: Name Position location francesca A 75 maria A 75 cristina B 36 And df2 is: location Country 75 UK 75 Italy 56 France 56 Austria So I thought I had to first eliminate the duplicates like this: df1_unique-subset(df1, !duplicated(location)) df2_unique-subset(df2, !duplicated(location)) After doing this I get: df1 : Name Position location francesca A 75 cristina B 36 And df2: location Country 75 UK 56 France And I would like to match on Location and the output to tell me which records are matching in df1 and not in df2, the ones matching in both, and the ones which are in df2 but are not matching in df1... Name Position Location Match francesca A 75 1 cristina B 36 0 As William suggested, df12 - merge(df1, cbind(df2, fromDF2=TRUE), all.x=TRUE, by=location) df12$Match - !is.na(df12$fromDF2) new_common- new[which(new$Match==TRUE),] Would give me the records that are matching, which should be correct, but I am not getting the correct value for the non-shared elements (the variants that are in the df2 but not indf1): df2_only - subset(df1_unique, !(location %in% df2_unique)) df2_only- df2_unique[-which(df2_unique$location %in% df1_unique$location),] Neither of these work and give me wrong records... My questions are: 1. How do I calculate the records from df2 which are NOT in df1? 2.Do I need to eliminate the duplictaes (or is there a way to record where they came from)? Any help is very appreciated... THANK YOU very much! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Merge two data frames and find common values and non-matching values
Hi, I am trying to find a function to match two data frames of different lengths for one field only. So, for example, df1 is: Name Position location francesca A 75 cristina B 36 And df2 is: location Country 75 UK 56 Austria And I would like to match on Location and the output to be something like: Name Position Location Match francesca A 75 1 cristina B 36 0 I have tried with the function 'match' or with: subset(df1, location %in% df2) But it does not work. Could you please help me figure out how to do this? Thank you! -f -- View this message in context: http://r.789695.n4.nabble.com/Merge-two-data-frames-and-find-common-values-and-non-matching-values-tp3868299p3868299.html Sent from the R help mailing list archive at Nabble.com. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Merge two data frames and find common values and non-matching values
Start out with merge(): df - merge(df1, df2, all.x=TRUE) # could add by=location for emphasis df location Name Position Country 1 36 cristinaBNA 2 75 francescaA UK You could make make your 'Match' column from is.na(df$Country) if you knew that df2$Country were never NA. Otherwise you can add a fake variable to the merge to tell which output rows come from unmatched rows in the first data.frame: df12 - merge(df1, cbind(df2, fromDF2=TRUE), all.x=TRUE, by=location) df12$Match - !is.na(df12$fromDF2) df12 location Name Position Country fromDF2 Match 1 36 cristinaBNA NA FALSE 2 75 francescaA UKTRUE TRUE Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of francy Sent: Monday, October 03, 2011 10:55 AM To: r-help@r-project.org Subject: [R] Merge two data frames and find common values and non-matching values Hi, I am trying to find a function to match two data frames of different lengths for one field only. So, for example, df1 is: Name Position location francesca A 75 cristina B 36 And df2 is: location Country 75 UK 56 Austria And I would like to match on Location and the output to be something like: Name Position Location Match francesca A 75 1 cristina B 36 0 I have tried with the function 'match' or with: subset(df1, location %in% df2) But it does not work. Could you please help me figure out how to do this? Thank you! -f -- View this message in context: http://r.789695.n4.nabble.com/Merge-two-data-frames-and-find-common- values-and-non-matching-values-tp3868299p3868299.html Sent from the R help mailing list archive at Nabble.com. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Merge two data frames and find common values and non-matching values
Hi, On Mon, Oct 3, 2011 at 1:54 PM, francy francy.casal...@gmail.com wrote: Hi, I am trying to find a function to match two data frames of different lengths for one field only. So, for example, df1 is: Name Position location francesca A 75 cristina B 36 And df2 is: location Country 75 UK 56 Austria And I would like to match on Location and the output to be something like: Sounds like you need merge() (just as in your subject line!). Name Position Location Match francesca A 75 1 cristina B 36 0 I have tried with the function 'match' or with: subset(df1, location %in% df2) But it does not work. Could you please help me figure out how to do this? Sarah -- Sarah Goslee http://www.functionaldiversity.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] merge two data frames
Dear list, here are my two data frames: av - structure(list(DESCRIPTION = c(COFFEE C Sep/10, COPPER Sep/10, CORN Dec/10, CRUDE OIL miNY Sep/10, GOLD Aug/10, HENRY HUB NATURAL GAS Sep/10, PALLADIUM Sep/10, SILVER Sep/10, SOYBEANS Nov/10, SPCL HIGH GRADE ZINC USD, SUGAR NO.11 Oct/10, WHEAT Sep/10), prix = c(-168.3, -1.62, -773.75, -78.75, -1168.3, -0.0916, -470.75, 1758.5, -975.25, 1964, -19.09, -605.75), pos = c(-1, 0, -2, -1, -1, 0, -1, 1, -1, 1, -1, -1), PL = c(-12.03, -31.68, -43.2, -70.49, -11.88, -95.04, -3.96, -35.64, -30.24, -12.5, -36.09, -4.32)), .Names = c(DESCRIPTION, prix, pos, PL), row.names = c(NA, 12L), class = data.frame) zz - structure(list(DESCRIPTION = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 13L, 11L, 12L), .Label = c(COFFEE C Sep/10, COPPER Sep/10, CORN Dec/10, CRUDE OIL miNY Sep/10, GOLD Aug/10, HENRY HUB NATURAL GAS Sep/10, PALLADIUM Sep/10, PRM HGH GD ALUMINIUM USD, SILVER Sep/10, SOYBEANS Nov/10, SPCL HIGH GRADE ZINC USD, SUGAR NO.11 Oct/10, WHEAT Sep/10), class = factor), pl = c(-8.22, 2.31, -47.25, -0.234, -12.7, -0.236, -2, 11.71000, 14.40001, -34.75, -10.75, 55, -0.668), PL = c(3075.001, -575.0003, 2362.5, 115.0002, 1270.000, 2360, 200, -292.7501, -720.0005, 1737.5, 537.5, -1375, 750.3998), POSITION = c(1, -1, 2, 2, 1, 2, -1, -1, 1, 2, 0, 0, 0), SETTLEMENT = c(167.4, 324.55, 390.75, 76.99, 1160.4, 4.718, 468.75, 2067.71, 1744.1, 978, 0, 0, 0)), .Names = c(DESCRIPTION, pl, PL, POSITION, SETTLEMENT), row.names = c(NA, -13L), class = data.frame) I am looking for one data frame with the column $PL=zz$PL+av$PL. I have been trying using the merge() function and its different arguments with no sucess. Any help is appreciated. Thank You. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] merge two data frames
TY Petr, it works. I will then replace NA by 0. 2010/8/4 Petr PIKAL petr.pi...@precheza.cz: Hi you tried OK result - merge(zz, av, by=DESCRIPTION, all=TRUE) and as you did not specify what to do when one value is NA here is one possible solution rowSums(cbind(result$PL.x, result$PL.y), na.rm=T) Regards Petr r-help-boun...@r-project.org napsal dne 04.08.2010 11:52:00: Dear list, here are my two data frames: av - structure(list(DESCRIPTION = c(COFFEE C Sep/10, COPPER Sep/10, CORN Dec/10, CRUDE OIL miNY Sep/10, GOLD Aug/10, HENRY HUB NATURAL GAS Sep/10, PALLADIUM Sep/10, SILVER Sep/10, SOYBEANS Nov/10, SPCL HIGH GRADE ZINC USD, SUGAR NO.11 Oct/10, WHEAT Sep/10), prix = c(-168.3, -1.62, -773.75, -78.75, -1168.3, -0.0916, -470.75, 1758.5, -975.25, 1964, -19.09, -605.75), pos = c(-1, 0, -2, -1, -1, 0, -1, 1, -1, 1, -1, -1), PL = c(-12.03, -31.68, -43.2, -70.49, -11.88, -95.04, -3.96, -35.64, -30.24, -12.5, -36.09, -4.32)), .Names = c(DESCRIPTION, prix, pos, PL), row.names = c(NA, 12L), class = data.frame) zz - structure(list(DESCRIPTION = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 13L, 11L, 12L), .Label = c(COFFEE C Sep/10, COPPER Sep/10, CORN Dec/10, CRUDE OIL miNY Sep/10, GOLD Aug/10, HENRY HUB NATURAL GAS Sep/10, PALLADIUM Sep/10, PRM HGH GD ALUMINIUM USD, SILVER Sep/10, SOYBEANS Nov/10, SPCL HIGH GRADE ZINC USD, SUGAR NO.11 Oct/10, WHEAT Sep/10), class = factor), pl = c(-8.22, 2.31, -47.25, -0.234, -12.7, -0.236, -2, 11.71000, 14.40001, -34.75, -10.75, 55, -0.668), PL = c(3075.001, -575.0003, 2362.5, 115.0002, 1270.000, 2360, 200, -292.7501, -720.0005, 1737.5, 537.5, -1375, 750.3998), POSITION = c(1, -1, 2, 2, 1, 2, -1, -1, 1, 2, 0, 0, 0), SETTLEMENT = c(167.4, 324.55, 390.75, 76.99, 1160.4, 4.718, 468.75, 2067.71, 1744.1, 978, 0, 0, 0)), .Names = c(DESCRIPTION, pl, PL, POSITION, SETTLEMENT), row.names = c(NA, -13L), class = data.frame) I am looking for one data frame with the column $PL=zz$PL+av$PL. I have been trying using the merge() function and its different arguments with no sucess. Any help is appreciated. Thank You. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- ** Arnaud Gaboury Mobile: +41 79 392 79 56 BBM: 255B488F __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] merge two data frames
Hi Arnaud, It is slightly confusing what you are asking but if you just want a dataframe with the two zz$PL and av$PL columns concatenated then merge is not what you are after. Try something like this. t - rbind(data.frame(PL=zz$PL), data.frame(PL=av$PL)) t PL 1 3075.00 2 -575.00 3 2362.50 4115.00 5 1270.00 6 2360.00 7200.00 8 -292.75 9 -720.00 10 1737.50 11 537.50 12 -1375.00 13 750.40 14 -12.03 15 -31.68 16 -43.20 17 -70.49 18 -11.88 19 -95.04 20-3.96 21 -35.64 22 -30.24 23 -12.50 24 -36.09 25-4.32 However, if you are trying to JOIN the dataframes on the PL column then you can use merge to give you this intersection...though as there aren't any intersecting values on this column, I suspect this isn't what you really want. Anyway to use merge try this: zz.av.join - merge(zz, av, by.x=PL, by.y=PL) zz.av.join [1] PLDESCRIPTION.x plPOSITION SETTLEMENT DESCRIPTION.y prix [8] pos 0 rows (or 0-length row.names) -- Dear list, here are my two data frames: av - structure(list(DESCRIPTION = c(COFFEE C Sep/10, COPPER Sep/10, CORN Dec/10, CRUDE OIL miNY Sep/10, GOLD Aug/10, HENRY HUB NATURAL GAS Sep/10, PALLADIUM Sep/10, SILVER Sep/10, SOYBEANS Nov/10, SPCL HIGH GRADE ZINC USD, SUGAR NO.11 Oct/10, WHEAT Sep/10), prix = c(-168.3, -1.62, -773.75, -78.75, -1168.3, -0.0916, -470.75, 1758.5, -975.25, 1964, -19.09, -605.75), pos = c(-1, 0, -2, -1, -1, 0, -1, 1, -1, 1, -1, -1), PL = c(-12.03, -31.68, -43.2, -70.49, -11.88, -95.04, -3.96, -35.64, -30.24, -12.5, -36.09, -4.32)), .Names = c(DESCRIPTION, prix, pos, PL), row.names = c(NA, 12L), class = data.frame) zz - structure(list(DESCRIPTION = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 13L, 11L, 12L), .Label = c(COFFEE C Sep/10, COPPER Sep/10, CORN Dec/10, CRUDE OIL miNY Sep/10, GOLD Aug/10, HENRY HUB NATURAL GAS Sep/10, PALLADIUM Sep/10, PRM HGH GD ALUMINIUM USD, SILVER Sep/10, SOYBEANS Nov/10, SPCL HIGH GRADE ZINC USD, SUGAR NO.11 Oct/10, WHEAT Sep/10), class = factor), pl = c(-8.22, 2.31, -47.25, -0.234, -12.7, -0.236, -2, 11.71000, 14.40001, -34.75, -10.75, 55, -0.668), PL = c(3075.001, -575.0003, 2362.5, 115.0002, 1270.000, 2360, 200, -292.7501, -720.0005, 1737.5, 537.5, -1375, 750.3998), POSITION = c(1, -1, 2, 2, 1, 2, -1, -1, 1, 2, 0, 0, 0), SETTLEMENT = c(167.4, 324.55, 390.75, 76.99, 1160.4, 4.718, 468.75, 2067.71, 1744.1, 978, 0, 0, 0)), .Names = c(DESCRIPTION, pl, PL, POSITION, SETTLEMENT), row.names = c(NA, -13L), class = data.frame) I am looking for one data frame with the column $PL=zz$PL+av$PL. I have been trying using the merge() function and its different arguments with no sucess. Any help is appreciated. Thank You. Amos Folarin (BSc MRes PhD) Research Associate, Bioinformatics Group * UCL Cancer Institute Paul O'Gorman Building University College London 72 Huntley Street London WC1E 6BT Direct Line: +44 (0)207 679 0764 Email: a.fola...@cancer.ucl.ac.uk mailto:a.fola...@cancer.ucl.ac.uk * ** This email and any files transmitted with it are confide...{{dropped:10}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] merge two data frames
Dear group, I have these 2 following data frame: allcon - structure(list(DESCRIPTION = structure(1:17, .Label = c(COFFEE C Jul/10, COPPER May/10, CORN Jul/10, CORN May/10, COTTON NO.2 Jul/10, CRUDE OIL miNY May/10, GOLD Jun/10, HENRY HUB NATURAL GAS May/10, ROBUSTA COFFEE (10) Jul/10, SILVER May/10, SOYBEANS Jul/10, SPCL HIGH GRADE ZINC USD, STANDARD LEAD USD, SUGAR NO.11 Jul/10, SUGAR NO.11 May/10, WHEAT Jul/10, WHEAT May/10), class = factor), pl = c(4.750003, -14, -1.5, 7.75, 3.74, 5.225005, 21.69998, -0.04900013, 68, 45.300, -8.5, -41, -118, 2.72, -0.902, -64.25, 0), quantity = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), SHORTDESCRIPTION = c(COFFEE C, COPPER, CORN, CORN, COTTON NO.2, CRUDE OIL miNY, GOLD, HENRY HUB NATURAL GAS, ROBUSTA COFFEE (10), SILVER, SOYBEANS, SPCL HIGH GRADE ZINC USD, STANDARD LEAD USD, SUGAR NO.11, SUGAR NO.11, WHEAT, WHEAT)), .Names = c(DESCRIPTION, pl, quantity, SHORTDESCRIPTION), row.names = c(NA, -17L ), class = data.frame) value - structure(list(SHORTDESCRIPTION = structure(c(7L, 11L, 10L, 14L, 9L, 22L, 16L, 17L, 4L, 19L, 27L, 18L, 26L, 2L, 3L, 21L, 6L, 15L, 1L, 28L, 5L, 25L, 23L, 24L, 29L, 8L, 12L, 13L, 20L), .Label = c( SUGAR NO.11, COCOA, COFFEE C, COPPER, CORN, COTTON NO.2, CRUDE OIL miNY, FEEDER CATTLE, GOLD, HEATING OIL, HENRY HUB NATURAL GAS, LEAN HOG, LIVE CATTLE, NY HARBR GASOLINE BLDSTK, ORANGE J., PALLADIUM, PLATINUM, PRIMARY NICKEL USD, PRM HGH GD ALUMINIUM USD, RANDOM LENGTH LUMBER, ROBUSTA COFFEE (10), SILVER , SOYBEAN MEAL , SOYBEAN OIL, SOYBEANS , SPCL HIGH GRADE ZINC USD, STANDARD LEAD USD, SUGAR WHITE, WHEAT), class = factor), VALUE = c(500L, 1L, 420L, 420L, 100L, 5000L, 100L, 50L, 250L, 25L, 25L, 6L, 25L, 10L, 375L, 10L, 500L, 150L, 1120L, 50L, 50L, 50L, 100L, 600L, 50L, 500L, 400L, 400L, 110L)), .Names = c(SHORTDESCRIPTION, VALUE), class = data.frame, row.names = c(NA, -29L)) I want to merge these df: y=merge(value,allcon,by.x=SHORTDESCRIPTION,all=T) Here is the result: y - structure(list(SHORTDESCRIPTION = structure(c(1L, 2L, 3L, 4L, 5L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 29L, 30L, 31L, 32L, 32L), .Label = c( SUGAR NO.11, COCOA, COFFEE C, COPPER, CORN, COTTON NO.2, CRUDE OIL miNY, FEEDER CATTLE, GOLD, HEATING OIL, HENRY HUB NATURAL GAS, LEAN HOG, LIVE CATTLE, NY HARBR GASOLINE BLDSTK, ORANGE J., PALLADIUM, PLATINUM, PRIMARY NICKEL USD, PRM HGH GD ALUMINIUM USD, RANDOM LENGTH LUMBER, ROBUSTA COFFEE (10), SILVER , SOYBEAN MEAL , SOYBEAN OIL, SOYBEANS , SPCL HIGH GRADE ZINC USD, STANDARD LEAD USD, SUGAR WHITE, WHEAT, SILVER, SOYBEANS, SUGAR NO.11), class = factor), VALUE = c(1120L, 10L, 375L, 250L, 50L, 50L, 500L, 500L, 500L, 100L, 420L, 1L, 400L, 400L, 420L, 150L, 100L, 50L, 6L, 25L, 110L, 10L, 5000L, 100L, 600L, 50L, 25L, 25L, 50L, 50L, 50L, NA, NA, NA, NA), DESCRIPTION = structure(c(NA, NA, 1L, 2L, 4L, 3L, 5L, 6L, NA, 7L, NA, 8L, NA, NA, NA, NA, NA, NA, NA, NA, NA, 9L, NA, NA, NA, NA, 12L, 13L, NA, 16L, 17L, 10L, 11L, 14L, 15L), .Label = c(COFFEE C Jul/10, COPPER May/10, CORN Jul/10, CORN May/10, COTTON NO.2 Jul/10, CRUDE OIL miNY May/10, GOLD Jun/10, HENRY HUB NATURAL GAS May/10, ROBUSTA COFFEE (10) Jul/10, SILVER May/10, SOYBEANS Jul/10, SPCL HIGH GRADE ZINC USD, STANDARD LEAD USD, SUGAR NO.11 Jul/10, SUGAR NO.11 May/10, WHEAT Jul/10, WHEAT May/10), class = factor), pl = c(NA, NA, 4.750003, -14, 7.75, -1.5, 3.74, 5.225005, NA, 21.69998, NA, -0.04900013, NA, NA, NA, NA, NA, NA, NA, NA, NA, 68, NA, NA, NA, NA, -41, -118, NA, -64.25, 0, 45.300, -8.5, 2.72, -0.902 ), quantity = c(NA, NA, 0, 0, 0, 0, 0, 0, NA, 0, NA, 0, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, NA, NA, NA, NA, 0, 0, NA, 0, 0, 0, 0, 0, 0), PL = c(NA, NA, -1781.251, 3500, -387.5, 75, -1874.999, -2612.503, NA, -2169.998, NA, 490.0013, NA, NA, NA, NA, NA, NA, NA, NA, NA, -680, NA, NA, NA, NA, 1025, 2950, NA, 3212.5, 0, NA, NA, NA, NA)), .Names = c(SHORTDESCRIPTION, VALUE, DESCRIPTION, pl, quantity, PL), row.names = c(NA, -35L), class = data.frame) As you can see, the last 4 rows have a NA in y$VALUE. How can I instead get the correct VALUE ? TY for any help __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] merge two data frames
Got it! First, here is the correct line : y=merge(value,allcon,all.y=T) Then, I got NA because of some white space (blancs) in names of elements of my data frame VALUE.It came from the .csv file from where I obtained the VALUE df. TY Excel again! In fact, as a general advice, I would SYSTEMATICALLY add the argument strip.white=T when I read a csv file, so all blanks are removed. -Original Message- From: arnaud Gaboury [mailto:arnaud.gabo...@gmail.com] Sent: Tuesday, May 11, 2010 4:25 PM To: r-help@r-project.org Cc: 'arnaud Gaboury' Subject: merge two data frames Dear group, I have these 2 following data frame: allcon - structure(list(DESCRIPTION = structure(1:17, .Label = c(COFFEE C Jul/10, COPPER May/10, CORN Jul/10, CORN May/10, COTTON NO.2 Jul/10, CRUDE OIL miNY May/10, GOLD Jun/10, HENRY HUB NATURAL GAS May/10, ROBUSTA COFFEE (10) Jul/10, SILVER May/10, SOYBEANS Jul/10, SPCL HIGH GRADE ZINC USD, STANDARD LEAD USD, SUGAR NO.11 Jul/10, SUGAR NO.11 May/10, WHEAT Jul/10, WHEAT May/10), class = factor), pl = c(4.750003, -14, -1.5, 7.75, 3.74, 5.225005, 21.69998, -0.04900013, 68, 45.300, -8.5, -41, -118, 2.72, -0.902, -64.25, 0), quantity = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), SHORTDESCRIPTION = c(COFFEE C, COPPER, CORN, CORN, COTTON NO.2, CRUDE OIL miNY, GOLD, HENRY HUB NATURAL GAS, ROBUSTA COFFEE (10), SILVER, SOYBEANS, SPCL HIGH GRADE ZINC USD, STANDARD LEAD USD, SUGAR NO.11, SUGAR NO.11, WHEAT, WHEAT)), .Names = c(DESCRIPTION, pl, quantity, SHORTDESCRIPTION), row.names = c(NA, -17L ), class = data.frame) value - structure(list(SHORTDESCRIPTION = structure(c(7L, 11L, 10L, 14L, 9L, 22L, 16L, 17L, 4L, 19L, 27L, 18L, 26L, 2L, 3L, 21L, 6L, 15L, 1L, 28L, 5L, 25L, 23L, 24L, 29L, 8L, 12L, 13L, 20L), .Label = c( SUGAR NO.11, COCOA, COFFEE C, COPPER, CORN, COTTON NO.2, CRUDE OIL miNY, FEEDER CATTLE, GOLD, HEATING OIL, HENRY HUB NATURAL GAS, LEAN HOG, LIVE CATTLE, NY HARBR GASOLINE BLDSTK, ORANGE J., PALLADIUM, PLATINUM, PRIMARY NICKEL USD, PRM HGH GD ALUMINIUM USD, RANDOM LENGTH LUMBER, ROBUSTA COFFEE (10), SILVER , SOYBEAN MEAL , SOYBEAN OIL, SOYBEANS , SPCL HIGH GRADE ZINC USD, STANDARD LEAD USD, SUGAR WHITE, WHEAT), class = factor), VALUE = c(500L, 1L, 420L, 420L, 100L, 5000L, 100L, 50L, 250L, 25L, 25L, 6L, 25L, 10L, 375L, 10L, 500L, 150L, 1120L, 50L, 50L, 50L, 100L, 600L, 50L, 500L, 400L, 400L, 110L)), .Names = c(SHORTDESCRIPTION, VALUE), class = data.frame, row.names = c(NA, -29L)) I want to merge these df: y=merge(value,allcon,by.x=SHORTDESCRIPTION,all=T) Here is the result: y - structure(list(SHORTDESCRIPTION = structure(c(1L, 2L, 3L, 4L, 5L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 29L, 30L, 31L, 32L, 32L), .Label = c( SUGAR NO.11, COCOA, COFFEE C, COPPER, CORN, COTTON NO.2, CRUDE OIL miNY, FEEDER CATTLE, GOLD, HEATING OIL, HENRY HUB NATURAL GAS, LEAN HOG, LIVE CATTLE, NY HARBR GASOLINE BLDSTK, ORANGE J., PALLADIUM, PLATINUM, PRIMARY NICKEL USD, PRM HGH GD ALUMINIUM USD, RANDOM LENGTH LUMBER, ROBUSTA COFFEE (10), SILVER , SOYBEAN MEAL , SOYBEAN OIL, SOYBEANS , SPCL HIGH GRADE ZINC USD, STANDARD LEAD USD, SUGAR WHITE, WHEAT, SILVER, SOYBEANS, SUGAR NO.11), class = factor), VALUE = c(1120L, 10L, 375L, 250L, 50L, 50L, 500L, 500L, 500L, 100L, 420L, 1L, 400L, 400L, 420L, 150L, 100L, 50L, 6L, 25L, 110L, 10L, 5000L, 100L, 600L, 50L, 25L, 25L, 50L, 50L, 50L, NA, NA, NA, NA), DESCRIPTION = structure(c(NA, NA, 1L, 2L, 4L, 3L, 5L, 6L, NA, 7L, NA, 8L, NA, NA, NA, NA, NA, NA, NA, NA, NA, 9L, NA, NA, NA, NA, 12L, 13L, NA, 16L, 17L, 10L, 11L, 14L, 15L), .Label = c(COFFEE C Jul/10, COPPER May/10, CORN Jul/10, CORN May/10, COTTON NO.2 Jul/10, CRUDE OIL miNY May/10, GOLD Jun/10, HENRY HUB NATURAL GAS May/10, ROBUSTA COFFEE (10) Jul/10, SILVER May/10, SOYBEANS Jul/10, SPCL HIGH GRADE ZINC USD, STANDARD LEAD USD, SUGAR NO.11 Jul/10, SUGAR NO.11 May/10, WHEAT Jul/10, WHEAT May/10), class = factor), pl = c(NA, NA, 4.750003, -14, 7.75, -1.5, 3.74, 5.225005, NA, 21.69998, NA, -0.04900013, NA, NA, NA, NA, NA, NA, NA, NA, NA, 68, NA, NA, NA, NA, -41, -118, NA, -64.25, 0, 45.300, -8.5, 2.72, - 0.902 ), quantity = c(NA, NA, 0, 0, 0, 0, 0, 0, NA, 0, NA, 0, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, NA, NA, NA, NA, 0, 0, NA, 0, 0, 0, 0, 0, 0), PL = c(NA, NA, -1781.251, 3500, -387.5, 75, -1874.999, -2612.503, NA, -2169.998, NA, 490.0013, NA, NA, NA, NA, NA, NA, NA, NA, NA, -680, NA, NA, NA, NA, 1025, 2950, NA, 3212.5, 0, NA, NA, NA, NA)), .Names = c(SHORTDESCRIPTION, VALUE, DESCRIPTION, pl, quantity, PL), row.names