On 2012-07-15 10:01, Paulo Barata wrote:

Dear Peter,

Thank you. I will try to modify my programming habits.
But it seems there is a flaw in R, when it accepts a reference
to a non-existent variable inside a data frame with the df$var
notation. This should be corrected somehow.

Paulo Barata


Paulo,

I understand your concerns and I do think that the "best"
thing would be to excise the $ shortcut from the language
or, at least, make y$x equivalent to
y[["x", exact = TRUE]]. But, as has been pointed out
before, that might not be easy. Nevertheless, even y[["x"]]
may not be the ultimate panacea. Consider your own
example:

df <- data.frame(a = 1:3, b=11:13)
sum(df[["aaa"]] == 2)
#[1] 0

which results from

df[["aaa"]] == 2
#logical(0)

The safest extraction is y[ , "x"], which signals an error for an unknown column:

sum(df[ , "aaa"] == 2)
#Error in `[.data.frame`(df, , "aaa") : undefined columns selected

But then, this comes down to whether one thinks that
addressing a nonexistent variable should result in an
error or should return NULL.
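
A minimal sketch that puts the three extraction forms side by side, using the same df as above. The object names (dollar_missing, strict_msg, wide, and so on) are only illustrative, and the second data frame is made up to show the exact-matching default of [[.

```r
df <- data.frame(a = 1:3, b = 11:13)

# `$` and `[[` both return NULL for a missing column, so a typo goes unnoticed.
dollar_missing  <- is.null(df$aaa)
bracket_missing <- is.null(df[["aaa"]])

# NULL == 2 is logical(0), and sum(logical(0)) is 0: no warning, no error.
silent_sum <- sum(df$aaa == 2)

# `[ , "name"]` signals an error instead; tryCatch() captures its message.
strict_msg <- tryCatch(df[, "aaa"], error = function(e) conditionMessage(e))

# `[[` matches names exactly by default; partial matching must be requested.
wide <- data.frame(alpha = 1:3)
exact_lookup   <- wide[["al"]]                 # NULL
partial_lookup <- wide[["al", exact = FALSE]]  # the alpha column
```

Which of the two behaviours (NULL or an error) is preferable depends on the use: NULL is handy interactively, while the error is usually what one wants inside a program.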

The bottom line probably is that the $ behaviour will not change
in the near future and one would simply be well advised to be
aware of its behaviour. Every language has its quirks. Just be
thankful that the R language isn't as big a mess as the English
language (which I do love dearly).

Peter Ehlers

---------------------------------------------------------------------


---------- Original Message -----------
From: Peter Ehlers <ehl...@ucalgary.ca>
To: Paulo Barata <paulo.bar...@ensp.fiocruz.br>
Cc: "r-help@r-project.org" <r-help@r-project.org>, peter dalgaard
<pda...@gmail.com>
Sent: Sun, 15 Jul 2012 09:29:11 -0700
Subject: Re: [R] variable (column) in a data frame

On 2012-07-15 08:41, Paulo Barata wrote:

Dr. Dalgaard,

Thank you. But pre-checking with is.null() or using with()
doesn't solve the problem of catching spelling mistakes
in the name of a variable inside a data frame when the
df$var notation is used often in a program.

Is there some way for R to behave, in relation to a variable
inside a data frame, the same way it behaves for a variable
not in a data frame? For example:

##----------------------------------------
a <- c(1, 2, 3)

## the variable exists, we get a correct answer
a==1

## the variable does not exist, R rightly points this out
aaa==1
##----------------------------------------

My point is, if we make a spelling mistake in a program when referring
to a variable inside a data frame, using the df$var notation,
there seems to be no way of getting warned about that.

You could wean yourself from the $-habit. It's convenient but can
lead to the problems you're experiencing (and this has been
discussed before). For programming, if you're prone to make
spelling errors, you should prefer df[, "aaa"]. See ?Extract.
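
For code that extracts columns in many places, one possible way to get the strict behaviour is a small wrapper. The sketch below is only an illustration: col_or_stop() is a hypothetical helper, not a base R function, and its error message wording is made up.

```r
# Hypothetical helper (not part of base R): extract one column by exact name,
# stopping with an informative error if the name is absent.
col_or_stop <- function(data, name) {
  stopifnot(is.data.frame(data), is.character(name), length(name) == 1L)
  if (!name %in% names(data)) {
    stop(sprintf("column '%s' not found; available columns: %s",
                 name, paste(names(data), collapse = ", ")),
         call. = FALSE)
  }
  data[[name]]
}

df <- data.frame(a = 1:3, b = 11:13)
n_twos <- sum(col_or_stop(df, "a") == 2)   # 1

# A misspelled name now stops instead of silently summing to 0;
# try() is used here only so that the example keeps running.
typo <- try(col_or_stop(df, "aaa"), silent = TRUE)
```

Unlike df[, "aaa"], the wrapper always returns a single column and lists the available names in its message, which may make typos easier to spot.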

Peter Ehlers


Thank you once again.

Paulo Barata

---------------------------------------------------------------------


---------- Original Message -----------
From: peter dalgaard <pda...@gmail.com>
To: "Paulo Barata" <paulo.bar...@ensp.fiocruz.br>
Sent: Sun, 15 Jul 2012 16:47:35 +0200
Subject: Re: [R] variable (column) in a data frame

On Jul 15, 2012, at 16:30 , Paulo Barata wrote:


To the R help list,

When using a data frame, there is no warning or error message
when I refer to a non-existent variable inside the data frame.

Example:

##----------------------------------------------

a <- c(1, 2, 3)
b <- c(11, 22, 33)
df <- data.frame(a, b)
df

## correct: there is a column in df named 'a'
## the sum is correctly performed
sum(df$a==2)

## incorrect: there is no column in df named 'aaa',
## but the sum is performed anyway without either warning or error
sum(df$aaa==2)

##----------------------------------------------

Is there some way to make R issue either a warning or an error
message in such a situation?


You can pre-check for is.null(df$aaa) or use with(df, sum(aaa==2)).
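
A short sketch of both suggestions, assuming the df from the example above and assuming that no object called aaa exists in the workspace. The names has_aaa and with_msg are only illustrative.

```r
df <- data.frame(a = c(1, 2, 3), b = c(11, 22, 33))

# Suggestion 1: pre-check. df$aaa is NULL when there is no such column.
has_aaa <- !is.null(df$aaa)
if (!has_aaa) message("df has no column named 'aaa'")

# Suggestion 2: with() looks the name up in df first and then in the calling
# environment, so an unknown name is an error ("object 'aaa' not found").
# Caveat: if an object named aaa exists outside df, it is used silently.
with_msg <- tryCatch(with(df, sum(aaa == 2)),
                     error = function(e) conditionMessage(e))

# For an existing column, with() behaves like the df$a version.
with_ok <- with(df, sum(a == 2))   # 1
```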

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com

--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.
------- End of Original Message -------

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


------- End of Original Message -------


