The whole thing is a merge operation, i.e.

> FruitNutr <- read.table(text="
+    Fruit Calories
+ 1 banana      100
+ 2   pear      100
+ 3  mango      200
+ ")
> FruitData <- read.table(text="
+    Fruit  Color  Shape Juice
+ 1  apple    red  round     1
+ 2 banana yellow oblong     0
+ 3   pear  green   pear   0.5
+ 4 orange orange  round     1
+ 5   kiwi  green  round     0
+ ")
> merge(FruitData, FruitNutr)
   Fruit  Color  Shape Juice Calories
1 banana yellow oblong   0.0      100
2   pear  green   pear   0.5      100
> merge(FruitData, FruitNutr, all.x=TRUE)
   Fruit  Color  Shape Juice Calories
1  apple    red  round   1.0       NA
2 banana yellow oblong   0.0      100
3   kiwi  green  round   0.0       NA
4 orange orange  round   1.0       NA
5   pear  green   pear   0.5      100
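As for what the [index[!is.na(index)]] part of the book's approach is doing, it may help to evaluate the pieces one at a time. The following is only a sketch: it rebuilds the two data frames with data.frame() so that it runs on its own, and the names `found` and `shortcut` are made up here for illustration; they are not from the book.

```r
# Self-contained copies of the two data frames from the question
fruitNutr <- data.frame(Fruit    = c("banana", "pear", "mango"),
                        Calories = c(100, 100, 200),
                        stringsAsFactors = FALSE)
fruitData <- data.frame(Fruit = c("apple", "banana", "pear", "orange", "kiwi"),
                        Color = c("red", "yellow", "green", "orange", "green"),
                        Shape = c("round", "oblong", "pear", "round", "round"),
                        Juice = c(1, 0, 0.5, 1, 0),
                        stringsAsFactors = FALSE)

# For each fruit in fruitData: which row of fruitNutr holds it? NA if none.
index <- match(x = fruitData$Fruit, table = fruitNutr$Fruit)
index                               # NA 1 2 NA NA

# 'found' (hypothetical name) marks the fruitData rows that had a match
found <- !is.na(index)              # FALSE TRUE TRUE FALSE FALSE

# Dropping the NAs leaves the fruitNutr row numbers to copy from
index[found]                        # 1 2
fruitNutr$Calories[index[found]]    # 100 100

# Left side: the matched rows of fruitData; right side: their calories
fruitData$Calories <- NA
fruitData$Calories[found] <- fruitNutr$Calories[index[found]]

# Indexing with NA already yields NA, so the filtering is optional here
shortcut <- fruitNutr$Calories[index]   # NA 100 100 NA NA
```

So the inner index[!is.na(index)] picks out row numbers 1 and 2 of fruitNutr, and the outer subscript fetches the calories in those rows. The last line gives the same column in one step, because subscripting with NA returns NA.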
Mind you, merge() comes with its own set of confusing options in the more complex cases, which may be why the authors have chosen a more elementary approach.

-pd

> On 18 Apr 2019, at 01:24 , Drake Gossi <drake.go...@gmail.com> wrote:
>
> Hello everyone,
>
> I'm working through this book, *Humanities Data in R* (Arnold & Tilton),
> and I'm just having trouble understanding this maneuver.
>
> In sum, I'm trying to combine data in two different data.frames.
>
> This data.frame is called fruitNutr
>
>    Fruit Calories
> 1 banana      100
> 2   pear      100
> 3  mango      200
>
> And this data.frame is called fruitData
>
>    Fruit  Color  Shape Juice
> 1  apple    red  round     1
> 2 banana yellow oblong     0
> 3   pear  green   pear   0.5
> 4 orange orange  round     1
> 5   kiwi  green  round     0
>
> So, as you can see, these two data.frames overlap insofar as they both have
> banana and pear. So, what happens next is the book suggests this:
>
> fruitData$calories <- NA
>
>
> As a result, I've created a new column for the fruitData data.frame:
>
>    Fruit  Color  Shape Juice Calories
> 1  apple    red  round     1      N/A
> 2 banana yellow oblong     0      N/A
> 3   pear  green   pear   0.5      N/A
> 4 orange orange  round     1      N/A
> 5   kiwi  green  round     0      N/A
>
> Then:
>
>> index <- match (x=fruitData$Fruit, table=fruitNutr$Fruit)
>> index
> [1] NA 1 2 NA NA
>> is.na(index)
> [1] TRUE FALSE FALSE TRUE TRUE
>> fruitData$Calories [!is.na(index)] <- fruitNutr$Calories[index[!is.na
> (index)]]
>> fruitData
>
>    Fruit  Color  Shape Juice Calories
> 1  apple    red  round     1      N/A
> 2 banana yellow oblong     0      100
> 3   pear  green   pear   0.5      100
> 4 orange orange  round     1      N/A
> 5   kiwi  green  round     0      N/A
>
> I get what the first part means, that first part being this:
> fruitData$Calories [!is.na(index)]
> go into the fruitData data.frame, specifically into the calories column,
> and only for what's true according to is.na(index). But I just literally
> can't understand this last part. fruitNutr$Calories[index[!is.na(index)]]
>
> Two questions.
>
>
>   1. I just literally don't understand how this code works. It does work,
>   of course, but I don't know what it's doing, specifically this [index[!
>   is.na(index)]] part. Could someone explain it to me like I'm five? I'm
>   new at this...
>   2. And then: is there any other way to combine these two data.frames so
>   that we get this same result? maybe an easier to understand method?
>
> That same result, again, is
>
>    Fruit  Color  Shape Juice Calories
> 1  apple    red  round     1      N/A
> 2 banana yellow oblong     0      100
> 3   pear  green   pear   0.5      100
> 4 orange orange  round     1      N/A
> 5   kiwi  green  round     0      N/A
>
>
> Drake
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd....@cbs.dk  Priv: pda...@gmail.com