On 27.08.24 at 11:55, peter dalgaard wrote:
Yes. A quirk, rather than a bug I'd say. One issue is that the internal logic
of transform() relies on
e <- eval(substitute(list(...)), `_data`, parent.frame())
tags <- names(e)
so untagged entries in ... will not be included.
... unless at least one is tagged:
R> transform(BOD, 0:5, 1:6)
  Time demand
1    1    8.3
2    2   10.3
3    3   19.0
4    4   16.0
5    5   15.6
6    7   19.8
R> transform(BOD, 0:5, 1:6, foo = 1)
  Time demand 0:5 1:6 foo
1    1    8.3   0   1   1
2    2   10.3   1   2   1
3    3   19.0   2   3   1
4    4   16.0   3   4   1
5    5   15.6   4   5   1
6    7   19.8   5   6   1
But as transform.data.frame is only documented for tagged vector
expressions, all examples provided in this thread were formal misuses.
(It might make sense to warn about untagged entries.)
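To make the tagging rule concrete, here is a minimal sketch that mirrors only the two lines quoted above. The helper name tags_of() is hypothetical and not part of base R; everything else transform.data.frame does is deliberately left out.

```r
# Hypothetical helper: evaluates ... the same way as the two quoted lines
# and returns the tags that transform() would go on to work with.
tags_of <- function(`_data`, ...) {
    e <- eval(substitute(list(...)), `_data`, parent.frame())
    names(e)
}

tags_none <- tags_of(BOD, 0:5, 1:6)           # NULL: no entry is tagged
tags_some <- tags_of(BOD, 0:5, 1:6, foo = 1)  # "" "" "foo"
```

With no tags at all, names() returns NULL, so as far as I can tell there is nothing left to match and the data frame comes back unchanged. As soon as one entry is tagged, the untagged ones carry the empty string as their name and are passed along with it.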
Personally, I'd be quite confused about what to expect from syntax like
transform(BOD, data.frame(y = 1:6))
as really no transformation is specified. Looks like cbind() or
data.frame() was meant.
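For comparison, a minimal sketch of what was presumably meant, calling cbind() and data.frame() directly on the BOD data set used in this thread (the variable names n1 and n2 are only for illustration):

```r
# Column-binding an untagged data frame works without any tag,
# and the inner column names are kept as they are.
n1 <- names(cbind(BOD, data.frame(y = 1:6)))
n2 <- names(data.frame(BOD, data.frame(y = 1:6, z = 6:1)))
n1  # "Time" "demand" "y"
n2  # "Time" "demand" "y" "z"
```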
Sebastian
The other part is a direct consequence of a quirk in data.frame:
data.frame(head(airquality), y=data.frame(x=rnorm(6)))
  Ozone Solar.R Wind Temp Month Day          x
1    41     190  7.4   67     5   1  0.3075402
2    36     118  8.0   72     5   2  0.7765265
3    12     149 12.6   74     5   3  0.3909341
4    18     313 11.5   62     5   4  0.4733170
5    NA      NA 14.3   56     5   5 -0.6947709
6    28      NA 14.9   66     5   6  0.1126040
whereas (the wisdom of this escapes me)
data.frame(head(airquality), y=data.frame(x=rnorm(6),z=rnorm(6)))
  Ozone Solar.R Wind Temp Month Day        y.x        y.z
1    41     190  7.4   67     5   1 -0.9250228 0.46483406
2    36     118  8.0   72     5   2 -0.5035793 0.28822668
...
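The naming rule is easier to see without the random numbers by looking only at names(). A minimal sketch, using BOD instead of airquality so that the row counts match (the variable names one and two are only for illustration):

```r
# A tagged one-column data frame: the tag "y" is dropped.
one <- data.frame(BOD, y = data.frame(x = 1:6))
# A tagged two-column data frame: the tag "y" becomes a prefix.
two <- data.frame(BOD, y = data.frame(x = 1:6, z = 6:1))

names(one)  # "Time" "demand" "x"
names(two)  # "Time" "demand" "y.x" "y.z"
```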
On the whole, I think that transform was never designed (nor documented) to
take data frame arguments, so caveat emptor.
- Peter
On 24 Aug 2024, at 16:41 , Gabor Grothendieck <ggrothendi...@gmail.com> wrote:
One oddity in transform that I recently noticed. It seems that to include
a one-column data frame in the arguments one must name it even though the
name is ignored. If the data frame has more than one column then it must
also be named but in that case it is not ignored and the names are made up of
a combination of that name and the data frame's names. I would have thought
that if we did not want a combination of names we would just not name the
argument.
# ignores second argument returning BOD unchanged
transform(BOD, data.frame(y = 1:6)) |> names()
## [1] "Time" "demand"
# ignores second argument returning BOD unchanged
transform(BOD, data.frame(y = 1:6, z = 6:1)) |> names()
## [1] "Time" "demand"
# with one column in data frame it adds the column and names it y ignoring x
transform(BOD, x = data.frame(y = 1:6)) |> names()
## [1] "Time" "demand" "y"
# with multiple columns in data frame it uses x.y and x.z as names
transform(BOD, x = data.frame(y = 1:6, z = 6:1)) |> names()
## [1] "Time" "demand" "x.y" "x.z"
--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel