Re: [R] levels of factor when subsetting the factor
Yes. I do this periodically:

    dat.new <- dat[1:6, ]
    dat.new[] <- lapply(dat.new, function(x) if (is.factor(x)) factor(x) else x)

HTH,

--sundar

Afshartous, David said the following on 9/12/2006 11:00 AM:
> thanks to all for the quick replies!
>
> if the factor is part of a dataframe, I can apply the subsetting
> to the entire dataframe, and then use drop=TRUE on the factor
> separately and then put it back into the new dataframe (code below).
> is there a way to do this in a single step?
>
> dat <- data.frame(fact = as.factor(c(rep("A", 3), rep("B", 3), rep("C", 3))),
>                   Y = rnorm(9))
> dat.new = dat[1:6, ]
> dat.new$fact = dat$fact[1:6, drop = T]
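The single-step idea in this reply can be tried on the toy data frame from the question. The sketch below is only an illustration: it rebuilds `dat` with random `Y` values, applies the `lapply()` approach, and then, as an alternative that is not mentioned in the thread, uses `droplevels()`, which current versions of base R provide with a data-frame method. The name `dat.alt` is made up for the example.

```r
# Toy data frame as in the question (Y is random, so values differ per run).
dat <- data.frame(fact = factor(rep(c("A", "B", "C"), each = 3)),
                  Y = rnorm(9))

# Approach from the reply: subset, then rebuild every factor column.
dat.new <- dat[1:6, ]
dat.new[] <- lapply(dat.new, function(x) if (is.factor(x)) factor(x) else x)
levels(dat.new$fact)   # "A" "B"

# Alternative (assumes a version of R that has droplevels()).
dat.alt <- droplevels(dat[1:6, ])
levels(dat.alt$fact)   # "A" "B"
```

Assigning into `dat.new[]` rather than `dat.new` keeps the object a data frame instead of replacing it with the plain list that `lapply()` returns.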
> > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] levels of factor when subsetting the factor
"Afshartous, David" <[EMAIL PROTECTED]> writes: > > All, > > When I take a subset of a factor the reduced factor still maintains all > the original levels of the factor when say forming the key in a plot. > The data is correct, but the variable still "remembers" the original > levels. See below for reproducible code. Does anyone know how to fix > this? > cheers, > dave > > fact = as.factor(c(rep("A", 3),rep("B", 3), rep("C", 3))) > new.fact = fact[1:6] > > new.fact > [1] A A A B B B > Levels: A B C## should only show A B Just use > factor(new.fact) [1] A A A B B B Levels: A B or > fact[1:6, drop=T] [1] A A A B B B Levels: A B And, no, it is not a bug. The fact that a subsample happens to consist only of males does not turn gender into a one-level factor... (Apart from the philosophy, it makes a real difference in tabulation.) -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] levels of factor when subsetting the factor
thanks to all for the quick replies!

if the factor is part of a dataframe, I can apply the subsetting to the
entire dataframe, then apply drop=TRUE to the factor separately and put
it back into the new dataframe (code below). is there a way to do this
in a single step?

dat <- data.frame(fact = as.factor(c(rep("A", 3), rep("B", 3), rep("C", 3))),
                  Y = rnorm(9))
dat.new = dat[1:6, ]
dat.new$fact = dat$fact[1:6, drop = T]
Re: [R] levels of factor when subsetting the factor
check ?"[.factor", you need:

fact[1:6, drop = TRUE]

Best,
Dimitris

Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm
Re: [R] levels of factor when subsetting the factor
Also, it is probably easier to use gl() than to coerce your data into a
factor:

fact <- gl(3, 3, labels = c("A", "B", "C"))
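To see that `gl()` builds the same factor as the `as.factor()` call in the question, the two can be compared directly. This is a minimal sketch; `f1` and `f2` are illustrative names.

```r
# Factor as built in the original question.
f1 <- as.factor(c(rep("A", 3), rep("B", 3), rep("C", 3)))

# Same pattern with gl(): 3 levels, each repeated 3 times.
f2 <- gl(3, 3, labels = c("A", "B", "C"))

identical(levels(f1), levels(f2))  # same levels, in the same order
all(f1 == f2)                      # same value at every position
```

The first argument of `gl()` is the number of levels and the second is the number of replications of each level.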
Re: [R] levels of factor when subsetting the factor
factor(new.fact) will do the trick. But that will recode the levels, and
that might be something you don't want.

> fact = as.factor(c(rep("A", 3),rep("B", 3), rep("C", 3)))
> new.fact = fact[1:6]
> new.fact
[1] A A A B B B
Levels: A B C
> factor(new.fact)
[1] A A A B B B
Levels: A B

Cheers,

Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
[EMAIL PROTECTED]
www.inbo.be
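The recoding mentioned here shows up in the integer codes behind the factor. In the thread's own example (rows 1 to 6) the codes happen to stay the same, so the sketch below uses a different, purely illustrative subset that skips the middle level B.

```r
fact <- factor(rep(c("A", "B", "C"), each = 3))

# Keep only the A and C entries; the levels are still A, B, C.
sub <- fact[c(1:3, 7:9)]

as.integer(sub)          # 1 1 1 3 3 3  (C is the third of three levels)
as.integer(factor(sub))  # 1 1 1 2 2 2  (C is now the second of two levels)
```

Code that relies on the integer codes, for example to index colours or plotting symbols, will therefore behave differently before and after the unused level is dropped.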
Re: [R] levels of factor when subsetting the factor
I think you want 'fact[1:6, drop = TRUE]'

-roger

--
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/
Re: [R] levels of factor when subsetting the factor
On 9/12/06, Afshartous, David <[EMAIL PROTECTED]> wrote:
> When I take a subset of a factor the reduced factor still maintains all
> the original levels of the factor when say forming the key in a plot.
> The data is correct, but the variable still "remembers" the original
> levels.  See below for reproducible code.  Does anyone know how to fix
> this?

Use the optional argument "drop = TRUE"

> fact = as.factor(c(rep("A", 3),rep("B", 3), rep("C", 3)))
> new.fact = fact[1:6]
> > new.fact
> [1] A A A B B B
> Levels: A B C      ## should only show A B

> fact[1:6, drop = TRUE]
[1] A A A B B B
Levels: A B
Re: [R] levels of factor when subsetting the factor
You have at least two choices:

R> factor(fact[1:6])
[1] A A A B B B
Levels: A B
R> fact[1:6, drop=TRUE]
[1] A A A B B B
Levels: A B

HTH,
Andy
Re: [R] levels of factor when subsetting the factor
Try

> new.fact = fact[1:6, drop=TRUE]

--
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP
Re: [R] levels of factor when subsetting the factor
Just add the following to your code

new.fact = fact[1:6, drop=T]
> new.fact
[1] A A A B B B
Levels: A B
[R] levels of factor when subsetting the factor
All,

When I take a subset of a factor the reduced factor still maintains all
the original levels of the factor when say forming the key in a plot.
The data is correct, but the variable still "remembers" the original
levels.  See below for reproducible code.  Does anyone know how to fix
this?
cheers,
dave

fact = as.factor(c(rep("A", 3),rep("B", 3), rep("C", 3)))
new.fact = fact[1:6]
> new.fact
[1] A A A B B B
Levels: A B C      ## should only show A B
RE: [R] levels of factor
Believe it or not, that's a feature, not a bug. The idea is that the
factor COULD take on those levels, even if it doesn't in your particular
subset. To drop them, you would have to re-initialize the factor like
so:

a$column2 <- factor(a$column2)

Or, you could install the Hmisc package, which redefines the subset
operator "[" to behave as you'd like. Personally, I think the default
behavior is clearer, however.

By the way, there are some problems with your code. First of all, you
should drop the quotes around column2--they're unnecessary. Secondly,
your two conditions contradict each other: == on a factor compares the
level labels, not the internal integer codes, so df$column2 == 1 is TRUE
only for a level labelled "1". No row can be "factor1" or "factor2" and
also "1", so the result has no rows. Was this your intention?

Kevin
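A quick check of how `==` treats a factor may help with the second point. In base R the comparison is made against the level labels rather than the internal integer codes. The sketch below uses a made-up vector named after the thread's `column2`; the poster's actual data is not known.

```r
# Hypothetical stand-in for df$column2.
column2 <- factor(c("factor1", "factor2", "factor3", "factor1"))

as.integer(column2)    # internal codes: 1 2 3 1
column2 == 1           # all FALSE: no level is labelled "1"
column2 == "factor1"   # TRUE FALSE FALSE TRUE

# Combining the two conditions from the original call selects nothing.
sum(column2 %in% c("factor1", "factor2") & column2 == 1)   # 0
```

If the intent really is to select by internal code, `as.integer(column2) == 1` does that, although selecting by label is usually less fragile.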
Re: [R] levels of factor
On Tue, 2004-08-17 at 09:30, Luis Rideau Cruz wrote:
> R-help,
>
> I have a data frame which I subset like :
>
> a <- subset(df,df$"column2" %in% c("factor1","factor2") & df$"column2"==1)
>
> But when I type levels(a$"column2") I still get the same levels as in df (my
> original data frame)
>
> Why is that?

The default for [.factor is:

x[i, drop = FALSE]

Hence, unused factor levels are retained.

> Is it right?

Yes. If you want to explicitly recode the factor based upon only those
levels that are actually in use, you can do something like the
following:

a$column2 <- factor(a$column2)

However, I am a bit unclear as to the logic of the subset statement that
you are using, perhaps b/c I don't know what your data is. You seem to
be subsetting 'column2' on both the factor levels and a presumed numeric
code. Is that really what you want to do? You might want to review the
"Warning" section in ?factor

BTW, when using subset(), the evaluation takes place within the data
frame, so you do not need to use df$"column2" in the function call. You
can just use column2, for example:

subset(df, column2 %in% c("factor1", "factor2"))

See ?factor and ?"[.factor" for more information.

HTH,

Marc Schwartz
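Putting the pieces of this reply together on a small example may help. The data frame below is hypothetical (the poster's `df` is not shown in the thread); only the column name `column2` and the level names are taken from the question.

```r
# Hypothetical data frame standing in for the poster's 'df'.
df <- data.frame(column2 = factor(c("factor1", "factor2", "factor3")),
                 value   = 1:3)

# Inside subset(), columns can be referred to by bare name.
a <- subset(df, column2 %in% c("factor1", "factor2"))
levels(a$column2)   # still "factor1" "factor2" "factor3"

# Re-create the factor from the values that are actually present.
a$column2 <- factor(a$column2)
levels(a$column2)   # "factor1" "factor2"
```

Note that `factor()` is applied to the column, not to the data frame as a whole.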
[R] levels of factor
R-help,

I have a data frame which I subset like:

a <- subset(df,df$"column2" %in% c("factor1","factor2") & df$"column2"==1)

But when I type levels(a$"column2") I still get the same levels as in df
(my original data frame).

Why is that? Is it right?

Luis

Luis Ridao Cruz
Fiskirannsóknarstovan
Nóatún 1
P.O. Box 3051
FR-110 Tórshavn
Faroe Islands
Phone: +298 353900
Phone(direct): +298 353912
Mobile: +298 580800
Fax: +298 353901
E-mail: [EMAIL PROTECTED]
Web: www.frs.fo