On 27/10/2021 7:16 p.m., Chris Brien wrote:
Hi listers,
I have a package asremlPlus on CRAN that manipulates formulae. It uses the
keep.order argument, as in the stats::terms function, to allow control over the
order of terms in a model.
However, stats::update.formula has no keep.order argument, so an updated
formula is always simplified to a sum of single terms and these terms are
always re-ordered. I have written a function that updates a formula and keeps
the order. The following example illustrates the problem and my solution.
#Functions to keep.order when a formula is updated
update_keep_order <- function(old, ...) {
  UseMethod("update_keep_order")
}
update_keep_order.formula <- function(old, new, keep.order = TRUE, ...) {
  # C_updateform is an unexported C entry point in stats, hence the ':::'
  tmp <- .Call(stats:::C_updateform, as.formula(old), as.formula(new))
  formula(terms.formula(tmp, keep.order = keep.order, simplify = TRUE))
}
#Generate some factors and a formula
facs <- expand.grid(A=1:2, B=1:2, C=1:2, D=1:2)
form <- with(facs, formula(~ A*B + C*D))
#Update with update.formula
(upd <- update(form, ~ . - C, keep.order = TRUE))
#Update with update_keep_order.formula
(upd_keep <- update_keep_order(form, ~ . - C, keep.order = TRUE))
However, the function calls the undocumented, unexported routine
stats:::C_updateform and so cannot be added to my package.
Is there another solution to achieve this outcome that does not require this
undocumented function and so could be incorporated in an R package on CRAN?
Thanks in advance for any help in solving this issue.
I can't think of a short-term solution. The long-term solution is to
prepare and submit a patch to R to add the keep.order argument to
update.formula.
As a workaround, perhaps you could modify the result of update() to
restore the original order. This sounds messy, but if you already have
code to manipulate formulae, maybe it's mostly in place. The steps
would be:
1. Decompose the formula into a vector of terms, e.g. ~A + B + A:B + C +
   D + C:D becomes

       old <- expression(A, B, A:B, C, D, C:D)

2. Update the formula, and decompose the new formula in the same way.
   So A + B + D + A:B + C:D becomes

       new <- expression(A, B, D, A:B, C:D)

3. Match each term in the new vector to its location in the original
   vector, and sort it.

       index <- match(new, old)
       sorted <- order(index)
       new[sorted]

   In your example, this gives expression(A, B, A:B, D, C:D).
   (You'll need to decide what to do with terms that weren't present in
   the original. They'll have NA values in index. Presumably they go to
   the end.)

4. Rebuild the formula after sorting.
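
The steps above could be put together roughly as follows. This is only a
minimal sketch under some assumptions: the name update_restore_order is
hypothetical (it is not in stats or asremlPlus), terms are compared by their
"term.labels" strings rather than as expressions, terms that were not in the
original formula are placed at the end, and offset() terms are not handled.
It uses only exported functions from stats.

```r
# Hypothetical helper: update a formula with stats::update(), then restore
# the order in which the surviving terms appeared in the original formula.
update_restore_order <- function(old, new) {
  old <- stats::as.formula(old)
  upd <- stats::update(old, new)

  # Steps 1 and 2: decompose both formulae into term labels.
  # keep.order = TRUE preserves the order in which 'old' was written.
  # (Assumes 'old' contains no '.', which terms() cannot expand without data.)
  old_labels <- attr(stats::terms(old, keep.order = TRUE), "term.labels")
  upd_terms  <- stats::terms(upd)
  upd_labels <- attr(upd_terms, "term.labels")

  # Nothing to reorder if no terms are left after the update.
  if (length(upd_labels) == 0L) return(upd)

  # Step 3: locate each updated term in the original and sort by position.
  # Terms absent from the original get NA, which order() puts last.
  index  <- match(upd_labels, old_labels)
  sorted <- upd_labels[order(index)]

  # Step 4: rebuild the formula, keeping the response and intercept status.
  response <- if (length(upd) == 3L) upd[[2L]] else NULL
  result <- stats::reformulate(
    sorted,
    response  = response,
    intercept = attr(upd_terms, "intercept") == 1
  )
  environment(result) <- environment(old)
  result
}

form <- ~ A*B + C*D
update_restore_order(form, ~ . - C)
```

For the example formula this should return the terms in the order A, B, A:B,
D, C:D, matching the sorted vector in step 3. Matching on label strings
assumes an interaction is labelled the same way before and after the update
(e.g. "A:B" in both), which may not hold if the update changes the order in
which the variables first appear.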
Duncan Murdoch
______________________________________________
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel