Re: [R] problems with coercing a factor to be numeric

2013-01-24 Thread Francesco Sarracino
Dear Ellison,

thanks a lot for your reply. Your explanation makes things much clearer.
Sincerely,
f.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problems with coercing a factor to be numeric

2013-01-23 Thread S Ellison


On 23 Jan 2013, at 21:36, "Francesco Sarracino"  wrote:

> what I meant is that I've read in "An R and S-Plus Companion to Applied
> Regression" about methods to alter the encoding of factors when using
> contrasts in regressions. These are options (for contrasts) that can
> easily be set with "options('contrasts')". This command changes the way R
> creates the dummies out of a factor, and various methods are available.
> I was expecting that R might have had something similar that applied to my
> case, thus changing the way R attaches numeric values to my dummy variable.
> I am just surprised that such an option doesn't exist. I had the wrong
> expectations.

Such options do exist, but at modelling time, not factor creation/conversion 
time.

When created, by calls to 'factor' or in functions like 'read.table', factors 
are stored internally as integers with a list of labels (what you see as factor 
levels) that go with each integer. Those internal integers start at 1 and go 
up. You can set the ordering of those labels (by specifying the "levels" 
argument in factor()) so that, for example, yes and no can be associated with 
(numeric) factor levels 1 and 2 respectively instead of the default ordering 
which would put 'no' alphabetically before 'yes'. (I find this choice 
particularly useful for orderings like "high", "medium", "low" for which the 
alphabetic ordering is not exactly intuitive; similarly alphabetic ordering 
puts '1', '2', '10' in the order '1', '10', '2' and so on, so that often needs 
specifying manually. It's also useful to specify levels if you want things like 
boxplots to come out in a particular order, as boxplots by default use the 
order of the factor levels).
The internal integer values are returned by 'as.numeric'. If your factor level
labels - which are always character - are also interpretable as numbers, you
need 'as.character' to return the character strings and then 'as.numeric' to
convert those.
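
The storage model described above can be checked with a short sketch. The object names (answers, answers2, sizes) are made up for illustration and are not from the thread; the sort order shown assumes a typical locale.

```r
# A factor stores integer codes (starting at 1) plus a character vector of levels.
answers <- factor(c("yes", "no", "yes", "no"))
levels(answers)        # default order is alphabetical: "no" "yes"
as.integer(answers)    # internal codes: 2 1 2 1

# Specifying 'levels' decides which label gets which internal code.
answers2 <- factor(c("yes", "no", "yes", "no"), levels = c("yes", "no"))
as.integer(answers2)   # internal codes: 1 2 1 2

# Labels that look like numbers: as.numeric() alone returns the codes,
# so go through as.character() to recover the values.
sizes <- factor(c("10", "2", "1", "10"))
levels(sizes)                     # "1" "10" "2" (character order, not numeric)
as.numeric(sizes)                 # codes: 2 3 1 2
as.numeric(as.character(sizes))   # values: 10 2 1 10
```

The last two lines are the practical point: the same factor gives different numbers depending on whether you ask for its codes or for its labels.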

Now, up to this point you just have more or less arbitrary integers associated
with the original factor levels (the degree of arbitrariness depends on whether
you specified the level order or let R use its default). These integers are not
the contrasts used in model fitting. Contrasts are set at model matrix building
time; they are not a fixed attribute of the factor. The internal numbering of
levels affects contrasts only to the extent that the numerical values used in
setting contrasts are usually in the same order as the factor levels. You can
inspect the functions used to associate contrasts with factor levels by using
options("contrasts"). You can inspect the numerical values that would currently
be used for a given factor with a call to contrasts(). You can change the
contrast assignments globally using options() or explicitly in some model calls
(lm, for example, has a contrasts argument) and if you like you can write your
own contrast functions to set any values you like. The most common are probably
treatment contrasts, which take the first factor level as the baseline
(absorbed into the intercept) and code the rest as (unit) differences from
that, and sum-to-zero contrasts, which do what they say, setting contrasts that
sum to zero by choosing a set like (-1, 0, 1).
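
A minimal sketch of the contrast controls mentioned above, using only base R and the stats package. The factor 'dose' and the response 'y' are hypothetical data invented for the example.

```r
# Hypothetical three-level factor and response, for illustration only.
dose <- factor(c("low", "medium", "high", "low", "medium", "high"),
               levels = c("low", "medium", "high"))
set.seed(1)
y <- rnorm(6)

options("contrasts")   # contrast functions currently used by default
contrasts(dose)        # coding that would be used for 'dose' right now

contr.treatment(3)     # first level is the baseline (a row of zeros)
contr.sum(3)           # each column sums to zero

# Change the default globally, look at the effect, then restore it.
old <- options(contrasts = c("contr.sum", "contr.poly"))
contrasts(dose)
options(old)

# Or override the coding for a single model via lm()'s contrasts argument.
fit <- lm(y ~ dose, contrasts = list(dose = "contr.sum"))
colnames(model.matrix(fit))   # "(Intercept)" "dose1" "dose2"
```

None of these calls changes the integer codes stored in 'dose'; they only change the columns built for the model matrix.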

So you actually have a great deal of control over both the order in which 
labels are associated with factor levels and the (separate) values of contrasts 
associated with those factor levels at modelling time. 

The cost of that control is some complexity, and the time needed to learn 
what's going on to use it all properly. 

Hope that helps ...


S Ellison



Re: [R] problems with coercing a factor to be numeric

2013-01-23 Thread Francesco Sarracino
Thank you all for your replies. Let me try to explain my point: first of
all, let me clarify that I didn't mean to criticize anyone (or anything).
Secondly, what I meant is that I've read in "An R and S-Plus Companion to
Applied Regression" about methods to alter the encoding of factors when using
contrasts in regressions. These are options (for contrasts) that can
easily be set with "options('contrasts')". This command changes the way R
creates the dummies out of a factor, and various methods are available.
I was expecting that R might have had something similar that applied to my
case, thus changing the way R attaches numeric values to my dummy variable.
I am just surprised that such an option doesn't exist. I had the wrong
expectations.
Thank you all for helping me clarify this point.
f.


On 23 January 2013 21:55, Rolf Turner  wrote:

>
> Given that your labels are "no" and "yes", what do you expect R to
> do?  To quote a well-known fortune, "R is lacking a mind_read() function!"
>
> cheers,
>
> Rolf Turner
>
>


-- 
Francesco Sarracino, Ph.D.
https://sites.google.com/site/fsarracino/



Re: [R] problems with coercing a factor to be numeric

2013-01-23 Thread Rolf Turner


Given that your labels are "no" and "yes", what do you expect R to
do?  To quote a well-known fortune, "R is lacking a mind_read() function!"

cheers,

Rolf Turner

On 01/23/2013 10:58 PM, Francesco Sarracino wrote:

> Thanks,
> this works! but I am surprised that R has such a strange behavior and that
> there is no way to control it.
> BTW, also as.integer(pp)-1 works!
> Still, it doesn't look to me as a first best.
> At any rate, thanks a lot for your help.
> f.




Re: [R] problems with coercing a factor to be numeric

2013-01-23 Thread William Dunlap
To find the proportion of "yes"s in pp you can use
   mean(pp == "yes")
and avoid the conversion of a factor to integer (and
subtracting 1).  The above works for character and factor
pp.
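
Applied to the example from the original post, the suggestion above might look like this (a sketch; pp is built exactly as in the question):

```r
# The dummy variable from the original post.
pp <- factor(rep(0:1, 10), levels = 0:1, labels = c("no", "yes"))

# Comparing against the label gives a logical vector; its mean is the
# proportion of "yes" values, with no dependence on the internal codes.
mean(pp == "yes")                 # 0.5

# The same comparison works if pp is a character vector instead of a factor.
mean(as.character(pp) == "yes")   # 0.5
```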

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf
> Of Francesco Sarracino
> Sent: Wednesday, January 23, 2013 1:59 AM
> To: D. Rizopoulos
> Cc: R help
> Subject: Re: [R] problems with coercing a factor to be numeric
> 
> Thanks,
> this works! but I am surprised that R has such a strange behavior and that
> there is no way to control it.
> BTW, also as.integer(pp)-1 works!
> Still, it doesn't look to me as a first best.
> At any rate, thanks a lot for your help.
> f.
> 
> 


Re: [R] problems with coercing a factor to be numeric

2013-01-23 Thread David Winsemius


On Jan 23, 2013, at 1:58 AM, Francesco Sarracino wrote:


> Thanks,
> this works! but I am surprised that R has such a strange behavior and that
> there is no way to control it.
> BTW, also as.integer(pp)-1 works!
> Still, it doesn't look to me as a first best.
> At any rate, thanks a lot for your help.


I think it is rather strange that you are criticising R because the  
mean or sum functions won't coerce factors to numeric class. R is  
already very loosely typed. It has a fairly limited number of object  
classes and there is widespread class coercion when it is appropriate.  
Can you explain why you believed factors or by logical extension  
character classed variables should get implicitly coerced by all  
mathematical functions?


--
David.





David Winsemius, MD
Alameda, CA, USA



Re: [R] problems with coercing a factor to be numeric

2013-01-23 Thread Francesco Sarracino
Thanks,
this works! but I am surprised that R has such a strange behavior and that
there is no way to control it.
BTW, also as.integer(pp)-1 works!
Still, it doesn't look to me as a first best.
At any rate, thanks a lot for your help.
f.


On 23 January 2013 10:53, D. Rizopoulos  wrote:

> check also
>
> pp <- rep(0:1, 10)
> pp <- factor(pp, levels=(0:1), labels=c("no","yes"))
>
> unclass(pp)
> unclass(pp) - 1
>
>
> Best,
> Dimitris
>
>



-- 
Francesco Sarracino, Ph.D.
https://sites.google.com/site/fsarracino/



Re: [R] problems with coercing a factor to be numeric

2013-01-23 Thread D. Rizopoulos
check also

pp <- rep(0:1, 10)
pp <- factor(pp, levels=(0:1), labels=c("no","yes"))

unclass(pp)
unclass(pp) - 1
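
To spell out what these two calls give for the mean asked about in the original post, here is a small sketch. The variable name 'codes' is introduced only for this illustration.

```r
pp <- factor(rep(0:1, 10), levels = 0:1, labels = c("no", "yes"))

codes <- unclass(pp)       # integer codes 1 and 2, still carrying the levels
attr(codes, "levels")      # "no" "yes"

mean(codes - 1)            # 0.5
mean(as.integer(pp) - 1)   # 0.5, same codes without the attribute
```

Subtracting 1 only gives a 0/1 dummy because "no" is the first level here; with a different level order the result would change.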


Best,
Dimitris


On 1/23/2013 10:48 AM, Francesco Sarracino wrote:
> Dear Dimitris,
>
> thanks for your quick reply. I've tried the solutions proposed in 7.10
> How do I convert factors to numeric?
>
> as.numeric(as.character(pp))
> and
> as.numeric(levels(pp))[as.integer(pp)]
>
> However, whatever I do, I get "Warning message: NAs introduced by coercion"
> and the output is a vector of NA.
>
> Any ideas?
> f.
>
>
>

-- 
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/


Re: [R] problems with coercing a factor to be numeric

2013-01-23 Thread Francesco Sarracino
Dear Dimitris,

thanks for your quick reply. I've tried the solutions proposed in 7.10 How
do I convert factors to numeric?

as.numeric(as.character(pp))
and
as.numeric(levels(pp))[as.integer(pp)]

However, whatever I do, I get "Warning message: NAs introduced by coercion"
and the output is a vector of NA.
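
A likely explanation, sketched below: both FAQ 7.10 idioms convert the level labels, and here the labels are "no" and "yes", which are not numbers. The second factor, qq, is a hypothetical variant without the labels argument, added only for comparison.

```r
pp <- factor(rep(0:1, 10), levels = 0:1, labels = c("no", "yes"))
levels(pp)                                 # "no" "yes"

# The labels cannot be parsed as numbers, so every element becomes NA.
# suppressWarnings() hides "NAs introduced by coercion".
na_result <- suppressWarnings(as.numeric(as.character(pp)))
all(is.na(na_result))                      # TRUE

# Without the relabelling, the labels are "0" and "1" and both idioms work.
qq <- factor(rep(0:1, 10))
as.numeric(as.character(qq))               # 0 1 0 1 ...
as.numeric(levels(qq))[as.integer(qq)]     # same values
mean(as.numeric(as.character(qq)))         # 0.5
```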

Any ideas?
f.



On 23 January 2013 10:39, D. Rizopoulos  wrote:

> Check R FAQ 7.10: How do I convert factors to numeric?
>
>
> I hope it helps.
>
> Best,
> Dimitris
>
>




-- 
Francesco Sarracino, Ph.D.
https://sites.google.com/site/fsarracino/



Re: [R] problems with coercing a factor to be numeric

2013-01-23 Thread D. Rizopoulos
Check R FAQ 7.10: How do I convert factors to numeric?


I hope it helps.

Best,
Dimitris


On 1/23/2013 10:33 AM, Francesco Sarracino wrote:
> Dear R listers,
>
> I am trying to compute the mean of a dummy variable that is encoded as a
> factor. However, even though the levels of my factor are 0 - 1, when I
> compute the mean (after coercing the factor to be
> numeric), R changes 0 into 1 and 1 into 2, thus altering my expected
> result.
>
> Please, consider the following working example:
> pp <- rep(0:1, 10)
> pp <- factor(pp, levels=(0:1), labels=c("no","yes"))
> mean(pp) #this won't work because the argument is not numeric or logical
> mean(as.integer(pp)) # this computes the average, but not on the range 0-1,
> but 1-2. Indeed, the result is 1.5 and not 0.5 as expected.
>
> What am I doing wrong?
> Thanks in advance for your kind support,
> f.
>
>

-- 
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/


[R] problems with coercing a factor to be numeric

2013-01-23 Thread Francesco Sarracino
Dear R listers,

I am trying to compute the mean of a dummy variable that is encoded as a
factor. However, even though the levels of my factor are 0 - 1, when I
compute the mean (after coercing the factor to be
numeric), R changes 0 into 1 and 1 into 2, thus altering my expected
result.

Please, consider the following working example:
pp <- rep(0:1, 10)
pp <- factor(pp, levels=(0:1), labels=c("no","yes"))
mean(pp) # this won't work because the argument is not numeric or logical
mean(as.integer(pp)) # this computes the average, but not on the range 0-1,
                     # but 1-2. Indeed, the result is 1.5 and not 0.5 as expected.

What am I doing wrong?
Thanks in advance for your kind support,
f.


-- 
Francesco Sarracino, Ph.D.
https://sites.google.com/site/fsarracino/
