I suppose that it is far too late to offer such a suggestion, but it
seems to me that the problem is in some measure the mechanism of
inheritance.
First, the tibble (although the name is incomprehensible, why not
something like "data.blob") is superior to the bog standard R
data.frame.
This may n
Duncan's observation is correct. The background work to the standards
I worked on was a big effort, and the content was a lot smaller than R,
though possibly similar in scope to dealing with the current question.
The "voting" was also very late in the process, after the proposals
were developed, di
On 26/09/2017 4:52 PM, Jens Oehlschlägel wrote:
On 26.09.2017 15:37, Hadley Wickham wrote:
I decided to make [.tibble type-stable (i.e. always return a data
frame) because this behaviour causes substantial problems in real data
analysis code. I did it understanding that it would cause some pack
On 26 September 2017 at 22:52, Jens Oehlschlägel wrote:
| also the Rcppverse
Not really, in the context of this thread.
Rcpp does not impose or suggest a particular way of doing things at the R
level. Rcpp, really, is mostly about making it a little easier to interface
with C/C++ level code fro
Having been around a while and part of several programming language and
other standards (see ISO 6373:1984 and IEEE 754-1985), I prefer some democracy
at the level of getting a standard. Though perhaps at the design level I can agree
with Hadley. However, we're now at the stage of needing to clean
> If that is right -- and I tend to believe it is right -- this change had
> better been done in R core and not on package level. I think the root of
> this evil is design inconsistencies of the language together with the lack
> of removing these inconsistencies. The longer we hesitated, the more
>
On Tue, Sep 26, 2017 at 12:15 PM, Patrick Perry wrote:
> Pro ignoring x[,1,drop=TRUE]:
> (1) it forces users to write consistent code for extracting a vector from a
> data frame
>
> Con:
> (1) functions that accept both matrices and data frames might break
> (x[[j]][i] doesn't work for a matrix)
On 26.09.2017 15:37, Hadley Wickham wrote:
I decided to make [.tibble type-stable (i.e. always return a data
frame) because this behaviour causes substantial problems in real data
analysis code. I did it understanding that it would cause some package
developers frustration, but I think it's bett
On Tue, Sep 26, 2017 at 10:40 AM, Joris Meys wrote:
> On Tue, Sep 26, 2017 at 5:33 PM, Hadley Wickham wrote:
>>
>> > I for one am happy this discussion pops up, because it's a piece of
>> > information I give to my students as well: convert to a data.frame when you
>> > start your analysis j
Pro ignoring x[,1,drop=TRUE]:
(1) it forces users to write consistent code for extracting a vector
from a data frame
Con:
(1) functions that accept both matrices and data frames might break
(x[[j]][i] doesn't work for a matrix)
(2) functions that use the access pattern x[i,j,drop = TRUE] will br
On Tue, Sep 26, 2017 at 5:33 PM, Hadley Wickham wrote:
> > I for one am happy this discussion pops up, because it's a piece of
> > information I give to my students as well: convert to a data.frame when you
> > start your analysis just to play safe. And this discussion shows why that is
> > -
> I for one am happy this discussion pops up, because it's a piece of
> information I give to my students as well: convert to a data.frame when you
> start your analysis just to play safe. And this discussion shows why that is
> -for the time being!- good advice. The moment tibbles become the def
On Tue, Sep 26, 2017 at 9:22 AM, Patrick Perry wrote:
> Would it be possible to change tibbles so that
>
> x[,1,drop=TRUE]
>
> returns a vector, not a data frame? I certainly find it surprising that
> tibbles ignore the drop argument. If tibbles respected the drop argument,
> then package develop
On 2017-09-26 15:37, Hadley Wickham wrote:
On Tue, Sep 26, 2017 at 2:30 AM, Göran Broström wrote:
I am beginning to get complaints from users of my CRAN packages (especially
'eha') to the effect that they get error messages like "Error: Unsupported
use of matrix or array for column indexing".
On Tue, Sep 26, 2017 at 3:38 PM, Hadley Wickham wrote:
>
> So we should never try and improve upon legacy behaviour? I don't
> understand what you're arguing for here. If a tibble didn't inherit
> from a data frame, it would be useless.
>
> Hadley
>
> --
> http://hadley.nz
>
I didn't say that. I
On Tue, Sep 26, 2017 at 8:51 AM, Pedro J. Aphalo wrote:
> What I think is troublesome is that data.frame is part of the definition
> of the R language, and the expectation based on R's normal behaviour is
> that testing with is.data.frame() should be enough to ensure that an
> object can be treate
What I think is troublesome is that data.frame is part of the definition
of the R language, and the expectation based on R's normal behaviour is
that testing with is.data.frame() should be enough to ensure that an
object can be treated as a data frame. We can think of different
solutions for us
On Tue, Sep 26, 2017 at 8:35 AM, Joris Meys wrote:
>> Where its parent class _sometimes_ returns an atomic vector and
>> _sometimes_ returns a data frame.
>
> Indeed. And a tibble doesn't, so there's a conflict. Nobody said data.frame
> works better than tibble. Actually, we all agree that the l
On Tue, Sep 26, 2017 at 2:30 AM, Göran Broström wrote:
> I am beginning to get complaints from users of my CRAN packages (especially
> 'eha') to the effect that they get error messages like "Error: Unsupported
> use of matrix or array for column indexing".
>
> It turns out that they are sticking i
> Where its parent class _sometimes_ returns an atomic vector and
> _sometimes_ returns a data frame.
>
> Hadley
>
Indeed. And a tibble doesn't, so there's a conflict. Nobody said data.frame
works better than tibble. Actually, we all agree that the legacy behaviour
sucks. But it exists, and causes
On Tue, Sep 26, 2017 at 8:28 AM, Jeroen Ooms wrote:
> On Tue, Sep 26, 2017 at 11:56 AM, Gábor Csárdi wrote:
>>
>> On Tue, Sep 26, 2017 at 10:35 AM, Joris Meys wrote:
>> > I don't like the dropping of dimensions either. That doesn't change the
>> > fact that a tibble reacts different from a data.
On Tue, Sep 26, 2017 at 11:56 AM, Gábor Csárdi wrote:
>
> On Tue, Sep 26, 2017 at 10:35 AM, Joris Meys wrote:
> > I don't like the dropping of dimensions either. That doesn't change the
> > fact that a tibble reacts different from a data.frame. So tibbles do not
> > inherit correctly from the cla
Thanks Gábor,
that is OK. However, if I would like an input tibble to remain a tibble
(after massaging) in the output, as a courtesy to the user, this will fail.
I think that it works if I instead treat the input as a list: That's all
'the tibble way' does (in my case at least).
Göran
On 2017-09-2
The problem is not with a data.frame or a tibble... the problem is when a
package unwittingly converts a data.frame/tibble to a vector, because of
bad defaults in data.frame methods, and then later on expects that vector
to be a vector without explicitly making it a vector or checking if it is a
ve
Yes, basically tibbles violate the substitution principle. A lot of
other packages do, probably base R as well, although it is sometimes
hard to say, because there is no clear object hierarchy.
Let's take a step back, and see how you can check for a data frame argument.
1. Weak check.
is.data.fr
On 2017-09-26 14:01, Daniel Lüdecke wrote:
You wrote:
The correct and logical way (which I use in 'eha') is to check if input is a
data frame, and if not, throw an error.
If you want to check for a data frame (and a data frame only), because you
don't want to coerce *any* object to data fra
Hi,
But the point is that
inherits(x, "data.frame", TRUE) == 1
will not distinguish between tibbles and other classes derived from
data.frame that do respect the original syntax. In most cases you cannot
(or do not want to) block the use of every class derived from data.frame. One would
need to use a test t
You wrote:
The correct and logical way (which I use in 'eha') is to check if input is a
data frame, and if not, throw an error.
If you want to check for a data frame (and a data frame only), because you
don't want to coerce *any* object to data frames, then this would be one way to
check for d
2017-09-26 13:41 GMT+02:00 Holger Hoefling :
> Hi Thierry,
>
> You write:
>
> "If a package requires a data.frame, then it is up to the _user_ to
> provide a data.frame (and a tibble is not a data.frame). "
>
> Actually, as pointed out before, calling
>
> is.data.frame
>
> on a tibble returns TRUE.
Hi Thierry,
You write:
"If a package requires a data.frame, then it is up to the _user_ to
provide a data.frame (and a tibble is not a data.frame). "
Actually, as pointed out before, calling
is.data.frame
on a tibble returns TRUE. So I think that R says - yes, a tibble is a data
frame. What wo
Dear all,
IMHO the problem is being looked at from the wrong perspective. The
tibble doesn't change the data.frame; it uses all methods from
data.frame which it doesn't implement itself. Hence it behaves like a
data.frame to some extent.
If a package requires a data.frame, then it is up to the _us
On Tue, Sep 26, 2017 at 11:56 AM, Gábor Csárdi wrote:
>
> I have yet to see an OOP system in which a subclass cannot override the
> methods
> of its superclass. Not only is this in line with OOP paradigms, it is
> actually one of
> the essential OOP features.
>
Fair enough. And I shouldn't have
David is right,
imagine some silly old code such as:
get_a.data.frame <- function(d) if ("data.frame" %in% class(d)) d["a", ]
This line of code giving you the row "a" of a data.frame could be in any
package.
No matter how ugly it is, it is technically correct and conforms to the
original definition
There is no benefit. It is a rather cumbersome approach to checking whether
something behaves as you expect it to. `as.data.frame` will force it into
what you need; if it cannot be forced, then it will fail. That it can be
converted to a data.frame is the class designer's responsibility, not
yours.
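
The coerce-at-the-boundary approach described above might be sketched as follows. This is a minimal base-R illustration, not code from any of the packages discussed; `first_column_mean` is a hypothetical helper name introduced only for this example.

```r
# Hypothetical helper: normalise the input once, at the function boundary,
# instead of testing its class.
first_column_mean <- function(d) {
  d <- as.data.frame(d)   # errors if d cannot be coerced to a data frame
  mean(d[, 1])            # base data.frame behaviour: one column drops to a vector
}

res_df  <- first_column_mean(data.frame(x = c(1, 2, 3)))
res_mat <- first_column_mean(matrix(1:6, ncol = 2))

# An object with no data-frame coercion (here, a function) fails early.
res_bad <- tryCatch(first_column_mean(sum), error = function(e) "not coercible")
```

The design trade-off raised elsewhere in the thread still applies: after coercion the original class is gone, so a function written this way cannot hand back the same class it was given without extra work.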
These replies seem to be missing the point, which is that old code has to
be rewritten because tibbles don't behave like data frames.
It is true that subclasses can override behaviour, but there is an implicit
contract that the same methods should do the same things.
The as.xxx pattern seems weir
What is the benefit here, compared to just calling as.data.frame() on it?
Gabor
On Tue, Sep 26, 2017 at 11:11 AM, Daniel Lüdecke wrote:
> Since tibbles add their class attributes first, you could use:
>
> tb <- tibble(a = 5)
> inherits(tb, "data.frame", which = TRUE) == 1
>
> if "tb" is a data f
Since tibbles add their class attributes first, you could use:
tb <- tibble(a = 5)
inherits(tb, "data.frame", which = TRUE) == 1
If "tb" is a plain data frame, TRUE is returned; for a tibble, FALSE. You could
then coerce to a data frame: as.data.frame(tb)
-Original Message-
From: R-pa
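
The check suggested above can be tried without installing tibble. The sketch below uses a stand-in object whose class vector is assumed to mimic a tibble's, c("tbl_df", "tbl", "data.frame"); `fake_tb` and `is_plain_df` are hypothetical names used only here.

```r
plain   <- data.frame(a = 5)
# Stand-in for a tibble (assumption: same class vector, no tibble methods loaded).
fake_tb <- structure(data.frame(a = 5), class = c("tbl_df", "tbl", "data.frame"))

# inherits(..., which = TRUE) gives the position of "data.frame" in class(x),
# or 0 when the class is absent, so "== 1" means "data.frame comes first".
is_plain_df <- function(x) inherits(x, "data.frame", which = TRUE) == 1

plain_result <- is_plain_df(plain)
fake_result  <- is_plain_df(fake_tb)

# as.data.frame() strips the classes that precede "data.frame".
coerced <- as.data.frame(fake_tb)
```

Note that this test rejects every subclass of data.frame, including ones that keep the base subsetting behaviour, which is the objection raised in the replies.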
On 2017-09-26 11:56, Gábor Csárdi wrote:
On Tue, Sep 26, 2017 at 10:35 AM, Joris Meys wrote:
I don't like the dropping of dimensions either. That doesn't change the
fact that a tibble reacts different from a data.frame. So tibbles do not
inherit correctly from the class data.frame, and it can
On 2017-09-26 11:35, Joris Meys wrote:
I don't like the dropping of dimensions either. That doesn't change the
fact that a tibble reacts different from a data.frame. So tibbles do not
inherit correctly from the class data.frame, and it can thus be argued
that it's against OOP paradigms to prete
On Tue, Sep 26, 2017 at 10:35 AM, Joris Meys wrote:
> I don't like the dropping of dimensions either. That doesn't change the
> fact that a tibble reacts different from a data.frame. So tibbles do not
> inherit correctly from the class data.frame, and it can thus be argued that
> it's against OOP
I don't like the dropping of dimensions either. That doesn't change the
fact that a tibble reacts different from a data.frame. So tibbles do not
inherit correctly from the class data.frame, and it can thus be argued that
it's against OOP paradigms to pretend tibbles inherit from the class
data.fram
Thanks for the examples. Personally, I have been caught out multiple times
by data frames dropping dimensions, so I have a distaste for this dropping
behaviour.
I prefer data frames *not* to drop dimensions. They are not
arrays, where dropping a dimension when slicing makes sense because all en
Here's one difference:
atib <- tibble(a = 1:5, b = letters[5:1])
atib[3,"a"]
as.data.frame(atib)[3,"a"]
The second line returns a tibble (dimensions are not dropped); the third line
returns a vector (dimensions are dropped). This is a huge difference if you use
[ , aColumn] to select a vector from a data frame.
Cheers
Joris
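
For comparison, here is a base-R-only sketch of access patterns that do not depend on the default dropping behaviour. It uses a plain data.frame; the variable names are illustrative.

```r
adf <- data.frame(a = 1:5, b = letters[5:1])

v_default <- adf[3, "a"]                # base default: dimensions dropped, a vector
d_kept    <- adf[3, "a", drop = FALSE]  # explicitly keep a one-cell data frame
v_explicit <- adf[["a"]][3]             # extract the column first, then the element
```

Writing `drop = FALSE` when a data frame is wanted, and `[[` when a vector is wanted, states the intent explicitly rather than relying on the default of `[`. As noted elsewhere in the thread, the `x[[j]][i]` form does not carry over to matrices.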
Hej Stefan,
On 2017-09-26 10:57, Stefan McKinnon Høj-Edwards wrote:
Hi Göran,
Could you please elaborate on which kind of subsetting Hadley dislikes?
I have yet to encounter operations on data frames that are not possible on
tibbles.
For instance, if 'dat' is a data frame, dat[1:3, 5] re
I could not agree more with Göran, and we had to change code in our packages
because of this too. I also often see students facing bugs because of it.
Again, with all the respect I have for Hadley.
On 26 Sep 2017 9:32 a.m., "Göran Broström" wrote:
I am beginning to get complaints from users of m
Hi Göran,
Could you please elaborate on which kind of subsetting Hadley dislikes?
I have yet to encounter operations on data frames that are not possible on
tibbles.
Kindly,
Stefan McKinnon Høj-Edwards
I am beginning to get complaints from users of my CRAN packages
(especially 'eha') to the effect that they get error messages like
"Error: Unsupported use of matrix or array for column indexing".
It turns out that they are sticking tibbles into functions that
expect data frames as input. An