Dear all,

I got an overwhelming number of responses to my question, and making a
complete summary is not possible. However, I learned that in my packages
I should change things like 'dat[5:8, 1]' to 'dat[[1]][5:8]', respecting
the fact that a data frame is a list.
Incidentally, this also solves the problem of using tibbles where data
frames are expected. The classes "tbl" and "tbl_df" will be preserved on
output; there is no need to use 'as.data.frame', and the simple check
'is.data.frame' is enough.
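
A minimal sketch of that change, using a made-up data frame 'dat' (the name and the columns are invented for illustration and are not taken from 'eha'). On a base data frame both forms give the same vector; the list-style form does not rely on how '[' treats a single column, so it should behave the same when 'dat' is a tibble:

```r
# Hypothetical example data; the column names are invented for illustration.
dat <- data.frame(enter = 0:9, exit = 1:10)

# Matrix-style indexing: relies on '[' dropping a single column to a vector.
old <- dat[5:8, 1]

# List-style indexing: pick the column first, then the rows.
new <- dat[[1]][5:8]

# For a base data frame the two results are the same vector.
stopifnot(is.data.frame(dat), identical(old, new))
```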

Thanks for a very interesting and enlightening discussion! A special thanks to Hadley for sharing great packages with us; I only wish they were easier to use in my own packages ;)

Göran

On 2017-09-26 15:37, Hadley Wickham wrote:
On Tue, Sep 26, 2017 at 2:30 AM, Göran Broström
<goran.brost...@umu.se> wrote:
I am beginning to get complaints from users of my CRAN packages
(especially 'eha') to the effect that they get error messages like
"Error: Unsupported use of matrix or array for column indexing".

It turns out that they are passing tibbles to functions that
expect data frames as input. And I am using the kind of subsetting
that Hadley dislikes (eha is an old package, much older than
tibbles). It is of course a simple matter to change the code so it
handles both data frames and tibbles correctly, but this affects
many functions, and it will take some time. And when the next guy
introduces 'troubles' as an improvement of 'tibbles', I will have
to rewrite the code again.

Changing df[, x] to df[[x]] is not very hard and makes your code easier to understand because it more clearly conveys the intent that you want a single column.

While I like Hadley's way of doing it, I think it is a mistake to
let a tibble also be of class data frame. To me it is a matter of
inheritance and backwards compatibility: A tibble should add nice
things to a data frame, not change basic behaviour, in order to
call itself a data frame.

Is it correct to let a tibble be of class "data.frame"?

If it did not inherit from data frame, it would not work with the 99% of
functions that work with data frames and don't deliberately take
advantage of the dropping behaviour of [. In other words, it would
be pointless.

I decided to make [.tibble type-stable (i.e. always return a data frame)
because the dropping behaviour of [ causes substantial problems in real
data analysis code. I did it understanding that it would cause some
package developers frustration, but I think it's better for a handful
of package maintainers to be frustrated than hundreds of users
creating dangerous code.
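
To illustrate the type-stability point with base R only (the data frame 'df' and the helper 'pick' are hypothetical and exist just for this sketch): the same '[' call returns a data frame or a bare vector depending on how many columns happen to be selected, while 'drop = FALSE' or '[[' state the intended result type explicitly.

```r
# Hypothetical data frame and helper, for illustration only.
df <- data.frame(a = 1:3, b = c(2.5, 3.5, 4.5), c = letters[1:3])

pick <- function(d, cols) d[, cols]

two <- pick(df, c("a", "b"))   # a data frame with two columns
one <- pick(df, "a")           # silently dropped to an integer vector

# Asking for a data frame explicitly keeps the return type stable.
one_df <- df[, "a", drop = FALSE]

# Asking for a single column explicitly always gives the column itself.
one_col <- df[["a"]]

stopifnot(is.data.frame(two), !is.data.frame(one),
          is.data.frame(one_df), identical(one, one_col))
```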

Hadley


______________________________________________
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel
