Re: [R] Question about a perceived irregularity in R syntax

2010-07-23 Thread Duncan Murdoch

On 23/07/2010 7:14 AM, Duncan Murdoch wrote:

Nordlund, Dan (DSHS/RDA) wrote:
>> -----Original Message-----
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Peter Dalgaard
>> Sent: Thursday, July 22, 2010 3:13 PM
>> To: Pat Schmitz
>> Cc: r-help@r-project.org
>> Subject: Re: [R] Question about a perceived irregularity in R syntax
>>
>> Pat Schmitz wrote:
>>> Both vector queries can select the values from the data.frame as written,
>>> however in the first form assigning a value to said selected numbers fails.
>>> Can you explain the reason this fails?
>>>
>>> dat <- data.frame(index = 1:10, Value = c(1:4, NA, 6, NA, 8:10))
>>>
>>> dat$Value[dat$Value == "NA"] <- 1 # Why does this fail to work,
>>> dat$Value[dat$Value %in% NA] <- 1 # while this does work?
>>>
>>> # Particularly when str() results in an equivalent class
>>> dat <- data.frame(index = 1:10, Value = c(1:4, NA, 6, NA, 8:10))
>>> str(dat$Value[dat$Value %in% NA])
>>> str(dat$Value[dat$Value == "NA"])
>>
>> 1. NA and "NA" are very different things
>> 2. check out is.na() and its help page
>
> I also would have suggested is.na to do the replacement.  What surprised me was that
>
> dat$Value[dat$Value %in% NA] <- 1
>
> actually worked.  I guess I always assumed that if
>
> > NA == NA
> [1] NA
>
> then an attempt to compare NA to elements in a vector would also return NA, but not so.
>
> > NA %in% c(1,NA,3)
> [1] TRUE
>
> Learned something new today,

I suspect that's not intentional, though I'm not sure it should be 
fixed.  According to the usual convention the result should be a logical NA.


Oops, not true. The behaviour is clearly documented in ?match:

Exactly what matches what is to some extent a matter of
definition. For all types, ‘NA’ matches ‘NA’ and no other
value. For real and complex values, ‘NaN’ values are regarded
as matching any other ‘NaN’ value, but not matching ‘NA’.
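
A minimal sketch of what that paragraph of ?match describes. The vector `x` and the result names are made up for illustration; only `match()` and `%in%` come from base R.

```r
# NA matches NA and NaN matches NaN, but the two never match each other.
x <- c(1, NA, NaN, 3)

pos_na  <- match(NA, x)    # position of the first NA in x
pos_nan <- match(NaN, x)   # position of the first NaN in x

na_in_x   <- NA %in% x          # x contains an NA
nan_in_na <- NaN %in% c(1, NA)  # NaN is not regarded as matching NA
na_in_nan <- NA %in% c(1, NaN)  # NA is not regarded as matching NaN

print(c(pos_na = pos_na, pos_nan = pos_nan))
print(c(na_in_x = na_in_x, nan_in_na = nan_in_na, na_in_nan = na_in_nan))
```

Because matching never yields a missing result here, `%in%` always returns TRUE or FALSE, which is why it can be used safely as a logical subscript.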

Thanks to Brian Ripley (the author of that paragraph) for pointing this 
out to me. Not sure how I missed it on my first reading, but the fact 
that it preceded my morning coffee might be a contributing factor.


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Question about a perceived irregularity in R syntax

2010-07-23 Thread Duncan Murdoch

Nordlund, Dan (DSHS/RDA) wrote:
>> -----Original Message-----
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Peter Dalgaard
>> Sent: Thursday, July 22, 2010 3:13 PM
>> To: Pat Schmitz
>> Cc: r-help@r-project.org
>> Subject: Re: [R] Question about a perceived irregularity in R syntax
>>
>> Pat Schmitz wrote:
>>> Both vector queries can select the values from the data.frame as written,
>>> however in the first form assigning a value to said selected numbers fails.
>>> Can you explain the reason this fails?
>>>
>>> dat <- data.frame(index = 1:10, Value = c(1:4, NA, 6, NA, 8:10))
>>>
>>> dat$Value[dat$Value == "NA"] <- 1 # Why does this fail to work,
>>> dat$Value[dat$Value %in% NA] <- 1 # while this does work?
>>>
>>> # Particularly when str() results in an equivalent class
>>> dat <- data.frame(index = 1:10, Value = c(1:4, NA, 6, NA, 8:10))
>>> str(dat$Value[dat$Value %in% NA])
>>> str(dat$Value[dat$Value == "NA"])
>>
>> 1. NA and "NA" are very different things
>> 2. check out is.na() and its help page
>
> I also would have suggested is.na to do the replacement.  What surprised me was that
>
> dat$Value[dat$Value %in% NA] <- 1
>
> actually worked.  I guess I always assumed that if
>
> > NA == NA
> [1] NA
>
> then an attempt to compare NA to elements in a vector would also return NA, but not so.
>
> > NA %in% c(1,NA,3)
> [1] TRUE
>
> Learned something new today,


I suspect that's not intentional, though I'm not sure it should be 
fixed.  According to the usual convention the result should be a logical NA.


Duncan Murdoch



Re: [R] Question about a perceived irregularity in R syntax

2010-07-22 Thread Nordlund, Dan (DSHS/RDA)
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Peter Dalgaard
> Sent: Thursday, July 22, 2010 3:13 PM
> To: Pat Schmitz
> Cc: r-help@r-project.org
> Subject: Re: [R] Question about a perceived irregularity in R syntax
>
> Pat Schmitz wrote:
> > Both vector queries can select the values from the data.frame as written,
> > however in the first form assigning a value to said selected numbers fails.
> > Can you explain the reason this fails?
> >
> > dat <- data.frame(index = 1:10, Value = c(1:4, NA, 6, NA, 8:10))
> >
> > dat$Value[dat$Value == "NA"] <- 1 # Why does this fail to work,
> > dat$Value[dat$Value %in% NA] <- 1 # while this does work?
> >
> > # Particularly when str() results in an equivalent class
> > dat <- data.frame(index = 1:10, Value = c(1:4, NA, 6, NA, 8:10))
> > str(dat$Value[dat$Value %in% NA])
> > str(dat$Value[dat$Value == "NA"])
>
> 1. NA and "NA" are very different things
> 2. check out is.na() and its help page
> 
> 

I also would have suggested is.na to do the replacement.  What surprised me was 
that 

dat$Value[dat$Value %in% NA] <- 1 

actually worked.  I guess I always assumed that if 

> NA == NA
[1] NA

then an attempt to compare NA to elements in a vector would also return NA, but 
not so.

> NA %in% c(1,NA,3)
[1] TRUE
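
For anyone else surprised by this, a small sketch of the difference, assuming a made-up vector `v`. `==` propagates missingness element by element, while `%in%` is documented in ?match as `match(x, table, nomatch = 0L) > 0L`, so it can only return TRUE or FALSE.

```r
v <- c(1, NA, 3)

eq_result <- v == NA     # every comparison with NA is itself NA
in_result <- v %in% NA   # match()-based, so never NA

# Spelling out the documented definition of %in% by hand
by_hand <- match(v, NA, nomatch = 0L) > 0L

print(eq_result)
print(in_result)
print(identical(in_result, by_hand))
```

The `nomatch = 0L` argument is what turns "no match" into FALSE instead of NA.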


Learned something new today,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204




Re: [R] Question about a perceived irregularity in R syntax

2010-07-22 Thread Peter Dalgaard
Pat Schmitz wrote:
> Both vector queries can select the values from the data.frame as written,
> however in the first form assigning a value to said selected numbers fails.
> Can you explain the reason this fails?
>
> dat <- data.frame(index = 1:10, Value = c(1:4, NA, 6, NA, 8:10))
>
> dat$Value[dat$Value == "NA"] <- 1 # Why does this fail to work,
> dat$Value[dat$Value %in% NA] <- 1 # while this does work?
>
> # Particularly when str() results in an equivalent class
> dat <- data.frame(index = 1:10, Value = c(1:4, NA, 6, NA, 8:10))
> str(dat$Value[dat$Value %in% NA])
> str(dat$Value[dat$Value == "NA"])

1. NA and "NA" are very different things
2. check out is.na() and its help page
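
To illustrate both points, here is a rough sketch using the data frame from the question. The helper names `cmp`, `n_true`, `n_na_before` and `n_na_after` are only for illustration. Comparing a numeric column with the string "NA" coerces the numbers to character, and a missing value compared with anything gives NA, so the index is never TRUE. An NA subscript selects nothing in an assignment with a length-one right-hand side, but returns NA when extracting, which is presumably why the two str() calls looked alike.

```r
dat <- data.frame(index = 1:10, Value = c(1:4, NA, 6, NA, 8:10))

# "NA" is a character string, not a missing value: the result is FALSE or NA
cmp <- dat$Value == "NA"
n_true <- sum(cmp, na.rm = TRUE)

# Assigning through an index holding only FALSE and NA changes nothing
dat$Value[cmp] <- 1
n_na_before <- sum(is.na(dat$Value))

# is.na() is the direct test for missing values
dat$Value[is.na(dat$Value)] <- 1
n_na_after <- sum(is.na(dat$Value))

print(c(n_true = n_true, n_na_before = n_na_before, n_na_after = n_na_after))
```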



-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com
