nealrichardson commented on pull request #11904:
URL: https://github.com/apache/arrow/pull/11904#issuecomment-995991878
I'll read through the PR, but first some high-level questions.
> This PR:
>
> * Defines `register_translation()` and `register_translation_agg()`
instead of direct assignment into `nse_funcs` and `agg_funcs`. This enables
attaching a package name when the function is being registered (e.g.,
`register_translation("stringr::str_dup", function(x, ...) ...)`) and makes it
possible in the future to allow other packages to define translations and/or
for us to change how we evaluate translated expressions without changing many
function assignments.
Why is attaching a package name useful?
I see the value of providing an extension point to users/other package
developers.
I don't understand what you mean by allowing "us to change how we evaluate
translated expressions without changing many function assignments".
> * Moves the registration of translations to package load time rather than
package build time. This enables splitting up translations into multiple files,
adds the usual CMD check that normal functions undergo (e.g., for use of
missing variables), and opens up the possibility of defining different
translations for different versions of packages (or omitting translations if a
package isn't installed).
Why is it a good idea to move this to package load time? Some might see that
as a negative if it is significant in run time. It also doesn't seem like it is
necessary to move the assignment to load time, is it? You could have the
registry function but still assign at build time.
We can split up translations into multiple files without this change.
Why is "the possibility of defining different translations for different
versions of packages" valuable?
Why is "omitting translations if a package isn't installed" valuable?
> * Splits up the definition of translations into dplyr-funcs-type.R,
dplyr-funcs-conditional.R, dplyr-funcs-string.R, dplyr-funcs-datetime.R, and
dplyr-funcs-math.R. This matches where the translations are tested (the test
files were named test-dplyr-funcs-string.R, etc.). Some translations were moved
to dplyr-summarise.R because the translations were being tested in
test-dplyr-summarise.R. This makes it easier for parallel PRs defining
translations and makes it easier for new contributors to figure out where tests
should go.
> * Consolidates internal documentation on how to write translations into
one .Rd file rather than scattered around as comments in dplyr-functions.R as
they were before.
> * Removes direct references to `nse_funcs` and `agg_funcs` except where
used to implement evaluation
>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]