Inline below.

-- Bert

On Tue, Jul 17, 2012 at 7:40 AM, Paulo Barata
<paulo.bar...@ensp.fiocruz.br> wrote:

>
> Dear Frans and Peter,
>
> Yes, the notation df[,'var'] is able to catch a non-existent
> variable var inside a data frame df. But the notation df$var
> isn't.
>
> So we have this situation, where two different notations, which
> (as far as I understand) perform the same action, have different
> kinds of response.
>

You don't understand far enough. Your assumption is simply not true. For
example, from ?"[" :

"The most important distinction between [, [[ and $ is that the [ can
select more than one element whereas the other two select a single element.

The default methods work somewhat differently for atomic vectors,
matrices/arrays and for recursive (list-like, see is.recursive)
objects. $ is only valid for recursive objects, and is only discussed in
the section below on recursive objects."


So the Help page already notes that there are differences among them.
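
To make the quoted distinction concrete, here is a minimal sketch in R. The
objects df and v are made up for illustration, and the comments describe the
behavior I would expect from the default and data frame methods:

```r
# Small example objects; the names are illustrative only.
df <- data.frame(a = c(1, 2, 3), b = c(11, 22, 33))
v  <- c(x = 1, y = 2)            # an atomic vector

# `[` can select several elements; with a single index it returns a data frame.
both <- df[c("a", "b")]
class(both)                      # "data.frame"

# `[[` and `$` select a single element (here, a column as a plain vector).
identical(df[["a"]], df$a)       # TRUE

# For a missing name, `$` and `[[` return NULL without complaint ...
is.null(df$aaa)                  # TRUE
is.null(df[["aaa"]])             # TRUE

# ... while `[` signals an error ("undefined columns selected").
res <- try(df[, "aaa"], silent = TRUE)
inherits(res, "try-error")       # TRUE

# `$` is not valid for atomic vectors at all.
res2 <- try(v$x, silent = TRUE)
inherits(res2, "try-error")      # TRUE
```

This is also why sum(df$aaa == 2) runs quietly: df$aaa is NULL, NULL == 2 is
logical(0), and the sum of an empty logical vector is 0.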

Nevertheless, your discomfort is, imo, understandable.
Extraction/replacement for data structures is a complex business, and R's
approach to the issues has "evolved" over time, with "inconsistencies,"
especially for edge cases, baked in. Because these issues are at the very
core of R's behavior, I think it likely that except for egregious
inconsistencies and outright bugs -- which at this point are most unlikely
to exist -- it is well nigh impossible to change them. I see no recourse
but to always check such edge cases carefully and to be as consistent as
possible in your own programming usage (e.g. always using [,".."] for
extracting columns). As Peter has pointed out several times, the $
extractor is convenient syntactic sugar that can get one into a lot of
trouble, and is probably best avoided.
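
One way to be consistent in your own code is to route column access through a
small checking function. The following is only a sketch: get_col is a
hypothetical helper, not something in base R, and the error message is my own
wording.

```r
# Hypothetical helper (not part of base R): fetch one column by exact name
# and stop with an error if the data frame has no such column.
get_col <- function(df, name) {
  stopifnot(is.data.frame(df), is.character(name), length(name) == 1L)
  if (!name %in% names(df)) {
    stop(sprintf("no column named '%s' in data frame", name), call. = FALSE)
  }
  df[[name]]
}

df <- data.frame(a = c(1, 2, 3), b = c(11, 22, 33))

sum(get_col(df, "a") == 2)       # 1
sum(df$aaa == 2)                 # 0, silently: df$aaa is NULL

# The helper turns the silent case into an error.
res <- try(get_col(df, "aaa"), silent = TRUE)
inherits(res, "try-error")       # TRUE
```

Using df[, "aaa"] directly gives the same protection without a helper; the
function just lets you choose the message and keeps the check in one place.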

Cheers,

Bert



> Couldn't this situation be fixed? Isn't it possible to make the
> df$var notation issue an error when referring to a non-existent
> variable inside the data frame?
>
> Thank you very much.
>
> Paulo Barata
>
> ---------------------------------------------------------------------
>
>
> ---------- Original Message -----------
> From: "Frans Marcelissen" <frans.marcelis...@digipsy.nl>
> To: "'Paulo Barata'" <paulo.bar...@ensp.fiocruz.br>, <r-help@r-project.org>
> Sent: Mon, 16 Jul 2012 14:25:21 +0200
> Subject: RE: [R] variable (column) in a data frame
>
> > Hi Paulo,
> > There is a difference between two ways of accessing columns in a data frame:
> > > df$aaa
> > NULL
> > > df["AAA"]
> > Error in `[.data.frame`(df, "AAA") : undefined columns selected
> > So df["AAA"] or df[,"AAA"] gives the error message you expect.
> > -------------------
> > Frans
> >
> > -----Original Message-----
> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> > On Behalf Of Paulo Barata
> > Sent: Sunday, 15 July 2012 16:31
> > To: r-help@r-project.org
> > Subject: [R] variable (column) in a data frame
> >
> > To the R help list,
> >
> > When using a data frame, there is no warning or error message when I
> > refer to a non-existent variable inside the data frame.
> >
> > Example:
> >
> > ##----------------------------------------------
> >
> > a <- c(1,2,3)
> > b <- c(11,22,33)
> > df <- data.frame(a,b)
> > df
> >
> > ## correct: there is a column in df named 'a'
> > ## the sum is correctly performed
> > sum(df$a==2)
> >
> > ## incorrect: there is no column in df named 'aaa',
> > ## but the sum is performed anyway without either warning or error
> > sum(df$aaa==2)
> >
> > ##----------------------------------------------
> >
> > Is there some way to make R issue either a warning or an error
> > message in such a situation?
> >
> > I am using R version 2.15.1 64-bit on Windows 7 Professional.
> >
> > Thank you very much.
> >
> > Paulo Barata
> >
> > ---------------------------------------------------------------------
> > Paulo Barata
> >
> > ENSP - Fundação Oswaldo Cruz
> > Rua Leopoldo Bulhões 1480 - 8A
> > 21041-210  Rio de Janeiro - RJ
> > Brazil
> > E-mail: paulo.bar...@ensp.fiocruz.br
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> ------- End of Original Message -------
>
>



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
