Try this:

> x <- data.frame(a=c('cat', 'cat,dog', 'dog', 'dog,cat'))
> x
        a
1     cat
2 cat,dog
3     dog
4 dog,cat
> levels(x$a)
[1] "cat"     "cat,dog" "dog"     "dog,cat"
> # change the factors
> x$a <- factor(sapply(strsplit(as.character(x$a), ','), '[[', 1))
> x
    a
1 cat
2 cat
3 dog
4 dog
> levels(x$a)
[1] "cat" "dog"
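
On the second question in the quoted message (levels with no observations that still show up with a count of 0), here is a minimal sketch. It assumes the empty levels come from subsetting, because a subset of a factor keeps the full level set of the original. The data and the names pets, kept, rebuilt, and dropped are made up for illustration. Re-creating the factor with factor(), as in the recode above, discards unused levels; droplevels() does the same for every factor column of a data frame in newer versions of R.

```r
# Hypothetical data: a factor column with three levels.
pets <- data.frame(a = factor(c("cat", "dog", "cat", "bird")))

# Subsetting removes the "bird" row but keeps "bird" as a level.
kept <- pets[pets$a != "bird", , drop = FALSE]
levels(kept$a)   # still includes "bird"
table(kept$a)    # "bird" is listed with a count of 0

# Option 1: rebuild the factor; factor() keeps only levels that occur.
rebuilt <- kept
rebuilt$a <- factor(rebuilt$a)
levels(rebuilt$a)

# Option 2: droplevels() drops unused levels from every factor column
# of a data frame (available in newer versions of R).
dropped <- droplevels(kept)
levels(dropped$a)
```

Either option leaves the observations untouched and only shrinks the set of levels, so summaries and tables no longer list the empty categories.
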

On Thu, Dec 10, 2009 at 10:53 PM, Jennifer Walsh <walsh...@umich.edu> wrote:

> Hi all,
>
> I've Googled far and wide but don't think I know the correct terms to
> search for to find an answer.
>
> I have a massive dataset where one of the factors is made up of both
> individual items and lists of items (for example, "cat" and "cat, dog,
> bird"). I would like to recode this factor somehow into only the first
> element of the list (so every list starting with "cat," plus the
> observations that were already just "cat" would all be set equal to "cat").
> I would ideally like to do this in some simple way that does not require me
> to write hundreds of different sets of code (since the lists probably start
> with 300+ different items). Is this possible? Extremely complicated?
>
> Also, I am sure this is much simpler, but I cannot seem to get rid of
> levels of a factor that have no observations. I have tried setting the
> levels of the factor to only the ones with observations that I am interested
> in, but every time I summarize the variable there are still 100+ labels all
> with "0" as their count. This hasn't happened to me before; is there an
> explanation for it?
>
> Thanks very much,
> Jen
>
> ---
> Jennifer Walsh
> Graduate Student, Developmental Psychology
> University of Michigan
> 2020 East Hall, 530 Church St.
> Ann Arbor, MI 48109-1043
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?