Jeff,

Even though the solutions from the previous responders are good enough for my current situation, the principle you just raised will definitely be beneficial to my future work. Thanks a lot for sharing the insights!

Gang

On Thu, Jul 17, 2014 at 12:06 PM, Jeff Newmiller <jdnew...@dcn.davis.ca.us> wrote:
> You ask about generic methods for introducing alternate values for factors,
> and some of the other responses address this quite efficiently.
>
> However, a factor has meaning only within one vector at a time, since
> another vector may have additional values or missing values relative to
> the first vector. For example, you used the "sample" function, which
> is not guaranteed to select at least one of each of the four letters in L4.
> Or, what if the data has values the mapping doesn't address?
>
> For any work in which I am dealing with categorical data in multiple
> places (e.g. your "d" data frame and whatever data structure you use
> to define your mapping) I prefer NOT to work with factors until all of
> my categories of data are moved into one vector (typically a column
> in a data frame). Rather, I work with character vectors during the
> data manipulation phase and only convert to factor when I start
> analyzing or displaying the data.
>
> With this in mind, I use a general flow something like:
>
> d <- data.frame( x = 1, y = 1:10, fac = fac, stringsAsFactors=FALSE )
> mp <- data.frame( fac=LETTERS[1:4], value=c(8,11,3,2) )
> d2 <- merge( d, mp, all.x=TRUE )
> d2$fac <- factor( d2$fac ) # optional
>
> If you actually are in the analysis phase and are not pulling data from
> multiple external sources, then you may have already confirmed the
> completeness and range of values you have to work with, in which case one
> of the other more efficient methods may still be a better choice for this
> specific task.
>
> Hadley Wickham's "tidy data" [1] principles address this concern more
> thoroughly than I have.
>
> [1] Google this phrase... paper seems to be a work in progress.
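A minimal sketch of the flow described above, using a small made-up data set so that both concerns show up: "E" appears in the data but not in the mapping, and "C" is in the mapping but never appears in the data. The data values, the `unmapped` variable, and the reordering step are assumptions added for illustration; they are not part of the original message.

```r
# Hypothetical data: 'E' has no entry in the mapping, and 'C' never occurs.
d <- data.frame(x = 1, y = 1:6,
                fac = c("B", "D", "A", "E", "B", "A"),
                stringsAsFactors = FALSE)
mp <- data.frame(fac = LETTERS[1:4], value = c(8, 11, 3, 2),
                 stringsAsFactors = FALSE)

# Left join: keep every row of d, even rows whose category has no mapping.
d2 <- merge(d, mp, by = "fac", all.x = TRUE)

# merge() sorts on the key by default, so restore the original row order.
d2 <- d2[order(d2$y), ]

# Categories present in the data but absent from the mapping show up as NA.
unmapped <- unique(d2$fac[is.na(d2$value)])

# Convert to factor only now that all categories sit in one column.
d2$fac <- factor(d2$fac)
```

Inspecting `unmapped` before the conversion to factor is one way to catch data values the mapping does not address; the resulting factor has levels only for categories that actually occur in `d2`.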
>
>
> On Thu, 17 Jul 2014, Gang Chen wrote:
>
>> Suppose I have the following dataframe:
>>
>> L4 <- LETTERS[1:4]
>> fac <- sample(L4, 10, replace = TRUE)
>> (d <- data.frame(x = 1, y = 1:10, fac = fac))
>>
>>    x  y fac
>> 1  1  1   B
>> 2  1  2   B
>> 3  1  3   D
>> 4  1  4   A
>> 5  1  5   C
>> 6  1  6   D
>> 7  1  7   C
>> 8  1  8   B
>> 9  1  9   B
>> 10 1 10   B
>>
>> I'd like to add another column 'var' that is defined based on the
>> following mapping of column 'fac':
>>
>> A -> 8
>> B -> 11
>> C -> 3
>> D -> 2
>>
>> How can I achieve this in an elegant way (with a generic approach for
>> any length)?
>>
>> Thanks,
>> Gang
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> ---------------------------------------------------------------------------
> Jeff Newmiller                        The     .....       .....  Go Live...
> DCN:<jdnew...@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>                                       Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
> ---------------------------------------------------------------------------
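
For the mapping asked about in the original post (A -> 8, B -> 11, C -> 3, D -> 2), one common base-R idiom is a named-vector lookup. This is only a sketch: it is not necessarily what the other responders posted, the name `map` is made up here, and the `fac` values are copied from the printed data frame rather than drawn with `sample`.

```r
# Mapping from category label to value, stored as a named numeric vector.
map <- c(A = 8, B = 11, C = 3, D = 2)

# The 'fac' column as printed in the original post.
d <- data.frame(x = 1, y = 1:10,
                fac = c("B", "B", "D", "A", "C", "D", "C", "B", "B", "B"),
                stringsAsFactors = FALSE)

# Index by name. as.character() matters if 'fac' is a factor, because
# indexing with a factor uses its integer codes rather than its labels.
d$var <- unname(map[as.character(d$fac)])
```

Any label missing from `map` yields NA in `var`, so a quick `anyNA(d$var)` check shows whether the mapping covers the data.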