Re: [R] levels of factor when subsetting the factor

2006-09-12 Thread Sundar Dorai-Raj
Yes. I do this periodically:

dat.new <- dat[1:6, ]
dat.new[] <- lapply(dat.new, function(x)
 if(is.factor(x)) factor(x) else x)
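
For readers on a newer R, a minimal sketch of the same single-step idea using droplevels(). This assumes R 2.12.0 or later, where droplevels() was added; it is not part of the thread. The data frame is rebuilt from the example quoted below.

```r
# Rebuild the example data frame from the thread.
dat <- data.frame(fact = factor(rep(c("A", "B", "C"), each = 3)),
                  Y = rnorm(9))

# Subset the rows, then drop unused levels from every factor column at once.
dat.new <- droplevels(dat[1:6, ])

levels(dat$fact)      # "A" "B" "C"
levels(dat.new$fact)  # "A" "B"
```

Non-factor columns such as Y are passed through unchanged, which is the same effect as the lapply() version above.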

HTH,

--sundar

Afshartous, David said the following on 9/12/2006 11:00 AM:
> thanks to all for the quick replies!
> 
> if the factor is part of a dataframe, I can apply the subsetting
> to the entire dataframe, then apply drop=TRUE to the factor
> separately and put it back into the new dataframe (code below).
> Is there a way to do this in a single step?
> 
> dat <- data.frame(fact = as.factor(c(rep("A", 3), rep("B", 3), rep("C", 3))),
>                   Y = rnorm(9))
> dat.new = dat[1:6, ]
> dat.new$fact = dat$fact[1:6, drop = TRUE]
> 
> 
>  
> 
> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Dalgaard
> Sent: Tuesday, September 12, 2006 11:45 AM
> To: Afshartous, David
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] levels of factor when subsetting the factor
> 
> "Afshartous, David" <[EMAIL PROTECTED]> writes:
> 
>>  
>> All,
>>
>> When I take a subset of a factor the reduced factor still maintains 
>> all the original levels of the factor when say forming the key in a plot.
>> The data is correct, but the variable still "remembers" the original 
>> levels.  See below for reproducible code.  Does anyone know how to fix 
>> this?
>> cheers,
>> dave
>>
>> fact = as.factor(c(rep("A", 3),rep("B", 3), rep("C", 3)))
>> new.fact = fact[1:6]
>> > new.fact
>> [1] A A A B B B
>> Levels: A B C   ## should only show A B
> 
> Just use
> 
>> factor(new.fact)
> [1] A A A B B B
> Levels: A B
> 
> or
> 
>> fact[1:6, drop=T]
> [1] A A A B B B
> Levels: A B
> 
> 
> And, no, it is not a bug. The fact that a subsample happens to consist only 
> of males does not turn gender into a one-level factor... (Apart from the 
> philosophy, it makes a real difference in tabulation.) 
> 
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] levels of factor when subsetting the factor

2006-09-12 Thread Peter Dalgaard
"Afshartous, David" <[EMAIL PROTECTED]> writes:

>  
> All,
> 
> When I take a subset of a factor the reduced factor still maintains all
> the original levels of the factor when say forming the key in a plot.
> The data is correct, but the variable still "remembers" the original
> levels.  See below for reproducible code.  Does anyone know how to fix
> this?
> cheers,
> dave
> 
> fact = as.factor(c(rep("A", 3),rep("B", 3), rep("C", 3)))
> new.fact = fact[1:6]
> > new.fact
> [1] A A A B B B
> Levels: A B C## should only show A B

Just use

> factor(new.fact)
[1] A A A B B B
Levels: A B

or

> fact[1:6, drop=T]
[1] A A A B B B
Levels: A B


And, no, it is not a bug. The fact that a subsample happens to consist
only of males does not turn gender into a one-level factor... (Apart
from the philosophy, it makes a real difference in tabulation.) 
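
A small sketch of the tabulation difference mentioned above, using the factor from the original question. The variable names follow the thread; the choice of table() as the illustration is an assumption about what "tabulation" refers to.

```r
fact <- factor(rep(c("A", "B", "C"), each = 3))
new.fact <- fact[1:6]

# The unused level "C" is kept, so it is tabulated with a count of zero.
table(new.fact)

# After dropping the level, the "C" cell disappears from the table.
table(factor(new.fact))
```

Keeping the zero cell matters when tables from different subsets need to line up column by column.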


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] levels of factor when subsetting the factor

2006-09-12 Thread Afshartous, David

thanks to all for the quick replies!

if the factor is part of a dataframe, I can apply the subsetting
to the entire dataframe, then apply drop=TRUE to the factor
separately and put it back into the new dataframe (code below).
Is there a way to do this in a single step?

dat <- data.frame(fact = as.factor(c(rep("A", 3), rep("B", 3), rep("C", 3))),
                  Y = rnorm(9))
dat.new = dat[1:6, ]
dat.new$fact = dat$fact[1:6, drop = TRUE]


 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Dalgaard
Sent: Tuesday, September 12, 2006 11:45 AM
To: Afshartous, David
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] levels of factor when subsetting the factor

"Afshartous, David" <[EMAIL PROTECTED]> writes:

>  
> All,
> 
> When I take a subset of a factor the reduced factor still maintains 
> all the original levels of the factor when say forming the key in a plot.
> The data is correct, but the variable still "remembers" the original 
> levels.  See below for reproducible code.  Does anyone know how to fix 
> this?
> cheers,
> dave
> 
> fact = as.factor(c(rep("A", 3),rep("B", 3), rep("C", 3)))
> new.fact = fact[1:6]
> > new.fact
> [1] A A A B B B
> Levels: A B C   ## should only show A B

Just use

> factor(new.fact)
[1] A A A B B B
Levels: A B

or

> fact[1:6, drop=T]
[1] A A A B B B
Levels: A B


And, no, it is not a bug. The fact that a subsample happens to consist only of 
males does not turn gender into a one-level factor... (Apart from the 
philosophy, it makes a real difference in tabulation.) 


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] levels of factor when subsetting the factor

2006-09-12 Thread Dimitris Rizopoulos
check ?"[.factor", you need:

fact[1:6, drop = TRUE]


Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm


- Original Message - 
From: "Afshartous, David" <[EMAIL PROTECTED]>
To: 
Sent: Tuesday, September 12, 2006 5:22 PM
Subject: [R] levels of factor when subsetting the factor


>
> All,
>
> When I take a subset of a factor the reduced factor still maintains all
> the original levels of the factor when say forming the key in a plot.
> The data is correct, but the variable still "remembers" the original
> levels.  See below for reproducible code.  Does anyone know how to fix
> this?
> cheers,
> dave
>
> fact = as.factor(c(rep("A", 3),rep("B", 3), rep("C", 3)))
> new.fact = fact[1:6]
> > new.fact
> [1] A A A B B B
> Levels: A B C   ## should only show A B
>


Re: [R] levels of factor when subsetting the factor

2006-09-12 Thread Doran, Harold
Also, it is probably easier to use gl() than to coerce your data into a
factor:

fact <- gl(3, 3, labels = c("A", "B", "C"))
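
A quick check, offered as a sketch, that gl() builds the same factor as the as.factor() call in the original question. The names fact.gl and fact.as are made up for this comparison.

```r
# gl(n, k, labels = ...) generates n levels, each repeated k times in a row.
fact.gl <- gl(3, 3, labels = c("A", "B", "C"))
fact.as <- as.factor(c(rep("A", 3), rep("B", 3), rep("C", 3)))

# Same values in the same order, and the same set of levels.
all(as.character(fact.gl) == as.character(fact.as))
identical(levels(fact.gl), levels(fact.as))
```

Subsetting fact.gl keeps unused levels in exactly the same way, so the drop = TRUE advice still applies.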

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Liaw, Andy
> Sent: Tuesday, September 12, 2006 11:32 AM
> To: Afshartous, David; r-help@stat.math.ethz.ch
> Subject: Re: [R] levels of factor when subsetting the factor
> 
> You have at least two choices:
> 
> R> factor(fact[1:6])
> [1] A A A B B B
> Levels: A B
> R> fact[1:6, drop=TRUE]
> [1] A A A B B B
> Levels: A B
> 
> HTH,
> Andy
> 
> 
> From: Afshartous, David
> >  
> > All,
> > 
> > When I take a subset of a factor the reduced factor still maintains
> > all the original levels of the factor when say forming the key in a
> > plot.
> > The data is correct, but the variable still "remembers" the original
> > levels.  See below for reproducible code.  Does anyone know how to fix
> > this?
> > cheers,
> > dave
> > 
> > fact = as.factor(c(rep("A", 3),rep("B", 3), rep("C", 3)))
> > new.fact = fact[1:6]
> > > new.fact
> > [1] A A A B B B
> > Levels: A B C   ## should only show A B
> > 


Re: [R] levels of factor when subsetting the factor

2006-09-12 Thread ONKELINX, Thierry
factor(new.fact) will do the trick. But that will recode the levels and
that might be something you don't want.

> fact = as.factor(c(rep("A", 3),rep("B", 3), rep("C", 3)))
> new.fact = fact[1:6]
> new.fact
[1] A A A B B B
Levels: A B C
> factor(new.fact)
[1] A A A B B B
Levels: A B
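
To make the recoding caveat concrete, here is a short sketch. It uses rows 4 to 9 rather than 1 to 6 (an assumption made so that the first level is the one dropped), and sub.fact is a name introduced only for this illustration.

```r
fact <- factor(rep(c("A", "B", "C"), each = 3))
sub.fact <- fact[4:9]          # only "B" and "C" remain in the data

as.integer(sub.fact)           # 2 2 2 3 3 3  (codes against levels A B C)
as.integer(factor(sub.fact))   # 1 1 1 2 2 2  (codes against levels B C)
```

The labels are unchanged, but any code that relies on the underlying integer codes will see different numbers after the call to factor().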

Cheers,

Thierry




ir. Thierry Onkelinx

Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest

Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance

Gaverstraat 4

9500 Geraardsbergen

Belgium

tel. + 32 54/436 185

[EMAIL PROTECTED]

www.inbo.be 


-Original message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On behalf of Afshartous, David
Sent: Tuesday 12 September 2006 17:23
To: r-help@stat.math.ethz.ch
Subject: [R] levels of factor when subsetting the factor

 
All,

When I take a subset of a factor the reduced factor still maintains all
the original levels of the factor when say forming the key in a plot.
The data is correct, but the variable still "remembers" the original
levels.  See below for reproducible code.  Does anyone know how to fix
this?
cheers,
dave

fact = as.factor(c(rep("A", 3),rep("B", 3), rep("C", 3)))
new.fact = fact[1:6]
> new.fact
[1] A A A B B B
Levels: A B C## should only show A B



Re: [R] levels of factor when subsetting the factor

2006-09-12 Thread Roger D. Peng
I think you want 'fact[1:6, drop = TRUE]'

-roger

Afshartous, David wrote:
>  
> All,
> 
> When I take a subset of a factor the reduced factor still maintains all
> the original levels of the factor when say forming the key in a plot.
> The data is correct, but the variable still "remembers" the original
> levels.  See below for reproducible code.  Does anyone know how to fix
> this?
> cheers,
> dave
> 
> fact = as.factor(c(rep("A", 3),rep("B", 3), rep("C", 3)))
> new.fact = fact[1:6]
>> new.fact
> [1] A A A B B B
> Levels: A B C## should only show A B
> 
> 

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/



Re: [R] levels of factor when subsetting the factor

2006-09-12 Thread Douglas Bates
On 9/12/06, Afshartous, David <[EMAIL PROTECTED]> wrote:
>
> All,
>
> When I take a subset of a factor the reduced factor still maintains all
> the original levels of the factor when say forming the key in a plot.
> The data is correct, but the variable still "remembers" the original
> levels.  See below for reproducible code.  Does anyone know how to fix
> this?

Use the optional argument "drop = TRUE"

> cheers,
> dave
>
> fact = as.factor(c(rep("A", 3),rep("B", 3), rep("C", 3)))
> new.fact = fact[1:6]
> > new.fact
> [1] A A A B B B
> Levels: A B C## should only show A B

> fact[1:6, drop = TRUE]
[1] A A A B B B
Levels: A B



Re: [R] levels of factor when subsetting the factor

2006-09-12 Thread Liaw, Andy
You have at least two choices:

R> factor(fact[1:6])
[1] A A A B B B
Levels: A B
R> fact[1:6, drop=TRUE]
[1] A A A B B B
Levels: A B

HTH,
Andy


From: Afshartous, David
>  
> All,
> 
> When I take a subset of a factor the reduced factor still 
> maintains all
> the original levels of the factor when say forming the key in a plot.
> The data is correct, but the variable still "remembers" the original
> levels.  See below for reproducible code.  Does anyone know how to fix
> this?
> cheers,
> dave
> 
> fact = as.factor(c(rep("A", 3),rep("B", 3), rep("C", 3)))
> new.fact = fact[1:6]
> > new.fact
> [1] A A A B B B
> Levels: A B C## should only show A B
> 


Re: [R] levels of factor when subsetting the factor

2006-09-12 Thread David Barron
Try

> new.fact = fact[1:6, drop=TRUE]



On 12/09/06, Afshartous, David <[EMAIL PROTECTED]> wrote:
>
> All,
>
> When I take a subset of a factor the reduced factor still maintains all
> the original levels of the factor when say forming the key in a plot.
> The data is correct, but the variable still "remembers" the original
> levels.  See below for reproducible code.  Does anyone know how to fix
> this?
> cheers,
> dave
>
> fact = as.factor(c(rep("A", 3),rep("B", 3), rep("C", 3)))
> new.fact = fact[1:6]
> > new.fact
> [1] A A A B B B
> Levels: A B C## should only show A B
>
>


-- 
=
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP



Re: [R] levels of factor when subsetting the factor

2006-09-12 Thread Doran, Harold
Just add the following to your code

new.fact = fact[1:6, drop=T]

> new.fact
[1] A A A B B B
Levels: A B 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Afshartous, David
> Sent: Tuesday, September 12, 2006 11:23 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] levels of factor when subsetting the factor
> 
>  
> All,
> 
> When I take a subset of a factor the reduced factor still 
> maintains all the original levels of the factor when say 
> forming the key in a plot.
> The data is correct, but the variable still "remembers" the 
> original levels.  See below for reproducible code.  Does 
> anyone know how to fix this?
> cheers,
> dave
> 
> fact = as.factor(c(rep("A", 3),rep("B", 3), rep("C", 3))) 
> new.fact = fact[1:6]
> > new.fact
> [1] A A A B B B
> Levels: A B C## should only show A B
> 


[R] levels of factor when subsetting the factor

2006-09-12 Thread Afshartous, David
 
All,

When I take a subset of a factor the reduced factor still maintains all
the original levels of the factor when say forming the key in a plot.
The data is correct, but the variable still "remembers" the original
levels.  See below for reproducible code.  Does anyone know how to fix
this?
cheers,
dave

fact = as.factor(c(rep("A", 3),rep("B", 3), rep("C", 3)))
new.fact = fact[1:6]
> new.fact
[1] A A A B B B
Levels: A B C## should only show A B



RE: [R] levels of factor

2004-08-17 Thread Kevin Bartz
Believe it or not, that's a feature, not a bug. The idea is that the factor
COULD take on those levels, even if it doesn't in your particular subset. To
drop them, you would have to re-initialize the factor as such:

a$column2 <- factor(a$column2)

Or, you could just download the Hmisc package, which redefines the subset
operator "[" to behave as you'd like. Personally, I think the default
behavior is clearer, however.

By the way, there are some problems with your code. First of all, you should
drop the quotes around column2--they're unnecessary. Secondly, your subset
is redundant: only one of your factor levels can be numbered 1, so only one
of the levels "factor1" and "factor2" is getting included in the result
(whichever is numbered 1 -- I'm guessing it's "factor1"). Was this your
intention?

Kevin

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Luis Rideau Cruz
Sent: Tuesday, August 17, 2004 7:30 AM
To: [EMAIL PROTECTED]
Subject: [R] levels of factor

R-help,

I have a data frame which I subset like this:

a <- subset(df,df$"column2" %in% c("factor1","factor2")  & df$"column2"==1)

But when I type levels(a$"column2") I still get the same levels as in df (my
original data frame)

Why is that?
Is it right?

Luis

Luis Ridao Cruz
Fiskirannsóknarstovan
Nóatún 1
P.O. Box 3051
FR-110 Tórshavn
Faroe Islands
Phone: +298 353900
Phone(direct): +298 353912
Mobile: +298 580800
Fax: +298 353901
E-mail:  [EMAIL PROTECTED]
Web:www.frs.fo



Re: [R] levels of factor

2004-08-17 Thread Marc Schwartz
On Tue, 2004-08-17 at 09:30, Luis Rideau Cruz wrote:
> R-help,
> 
> I have a data frame which I subset like this:
> 
> a <- subset(df,df$"column2" %in% c("factor1","factor2")  & df$"column2"==1)
> 
> But when I type levels(a$"column2") I still get the same levels as in df (my 
> original data frame)
> 
> Why is that?

The default for [.factor is:

x[i, drop = FALSE]

Hence, unused factor levels are retained.

> Is it right?

Yes.

If you want to explicitly recode the factor based upon only those levels
that are actually in use, you can do something like the following:

a$column2 <- factor(a$column2)


However, I am a bit unclear as to the logic of the subset statement that
you are using, perhaps b/c I don't know what your data is.

You seem to be subsetting 'column2' on both the factor levels and a
presumed numeric code. Is that really what you want to do?

You might want to review the "Warning" section in ?factor

BTW, when using subset(), the evaluation takes place within the data
frame, so you do not need to use df$"column2" in the function call. You
can just use column2, for example:

subset(df, column2 %in% c("factor1", "factor2"))
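
A runnable sketch of this call followed by the level cleanup. The poster's data frame is not shown, so the df below is a hypothetical stand-in with a made-up third level "factor3" and a made-up value column.

```r
# Hypothetical stand-in for the poster's data frame, which is not shown.
df <- data.frame(column2 = factor(c("factor1", "factor2", "factor3", "factor1")),
                 value   = 1:4)

# Bare column name inside subset(); no df$ prefix or quotes needed.
a <- subset(df, column2 %in% c("factor1", "factor2"))
levels(a$column2)    # still includes "factor3"

# Rebuild the factor column so only the levels in use remain.
a$column2 <- factor(a$column2)
levels(a$column2)    # "factor1" "factor2"
```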

See ?factor and ?"[.factor" for more information.

HTH,

Marc Schwartz



[R] levels of factor

2004-08-17 Thread Luis Rideau Cruz
R-help,

I have a data frame which I subset like this:

a <- subset(df,df$"column2" %in% c("factor1","factor2")  & df$"column2"==1)

But when I type levels(a$"column2") I still get the same levels as in df (my
original data frame)

Why is that?
Is it right?

Luis

Luis Ridao Cruz
Fiskirannsóknarstovan
Nóatún 1
P.O. Box 3051
FR-110 Tórshavn
Faroe Islands
Phone: +298 353900
Phone(direct): +298 353912
Mobile: +298 580800
Fax: +298 353901
E-mail:  [EMAIL PROTECTED]
Web:www.frs.fo
