On Wed, 22 Dec 2021, Ivan Krylov wrote:

On Sat, 18 Dec 2021 11:50:54 +0100
Arnaud FELD <arnaud.feldm...@gmail.com> wrote:

However, I'm a bit troubled about the "address" argument. What is it
intended for, since (as far as I know) "address equality" has until now
not been something left for the user to decide within R?

Using the words from "Extending R" by John M. Chambers, the concept of
address identity could be related to the question:

    If some of the data in the object has changed, is this still the
    same object?

Most objects in R are defined by their content. If you had a 100x100
matrix and changed an element at [50,50], it's now a different matrix,
even if it's stored in the same variable. If you create another 100x100
matrix in a different variable but fill it with the same numbers, it
should still compare equal to your original matrix.
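
As a small illustration of the content-based comparison described
above, here is a minimal sketch in base R. The variable names (a, b,
a_old) are made up for the example and do not come from the thread.

```r
# Two matrices created independently but filled with the same numbers.
a <- matrix(0, nrow = 100, ncol = 100)
b <- matrix(0, nrow = 100, ncol = 100)
same_before <- identical(a, b)     # TRUE: compared by content

# Keep the old value around, then change one element of 'a'.
a_old <- a
a[50, 50] <- 1
same_after <- identical(a, a_old)  # FALSE: 'a' is now a different matrix

# 'a_old' is unaffected by the assignment (value semantics).
old_value <- a_old[50, 50]         # 0
```

Where the two matrices happen to live in memory plays no part in
either comparison.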

Not all types of R objects are like that. Environments are good
candidates for pointer equality comparison. For example, the contents
of the global environment change every time you assign some variable in
the R command line, but it remains the same global environment. Indeed,
identical() for environments just compares their pointers: even if two
different environments only contain objects that compare equal, they
cannot be considered the same environment, because different closures
might be referring to them. data.tables are similar: if you had a giant
dataset and, as part of cleaning it up, removed some outliers, perhaps
it should be considered the same dataset, even if the contents aren't
strictly the same any more. Same goes for reference class and R6
objects: unlike the pass-by-value semantics associated with most
objects in R, these are assumed to carry global state within them, and
modifications to them are reflected everywhere they are referenced, not
limited to the current function call.
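
The environment case can be sketched the same way in base R. Again the
names (e1, e2, alias) are hypothetical and only serve the illustration.

```r
# Two distinct environments holding objects that compare equal.
e1 <- new.env()
e2 <- new.env()
assign("x", 1, envir = e1)
assign("x", 1, envir = e2)

same_contents <- identical(as.list(e1), as.list(e2))  # TRUE
same_env      <- identical(e1, e2)                    # FALSE: different environments

# A second reference to e1: changes made through it are visible via e1,
# and it is still the same environment afterwards.
alias <- e1
assign("y", 2, envir = alias)
still_same <- identical(e1, alias)                    # TRUE
seen_in_e1 <- get("y", envir = e1)                    # 2
```

Equal contents are not enough here: identical() only reports TRUE for
the very same environment, and modifying it does not change that.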

This is still experimental and the 'address' option may not survive at
the R level. There are some C level applications where it can be
useful; maybe it will only be retained there.

I *think* that most (if not all) objects with reference semantics
already use pointer comparison when being compared by identical(), so
the default of "identical" is, as the help page says, almost always the
right choice, but if it matters to your code whether the objects are
actually stored in the same area in the memory, use hashes of type
"address".

Unfortunately not all: External pointer objects are reference objects
but by default are not compared based on object address. Fixing the
default is not an option in the short term as it breaks too much code
(mostly through dependencies on a few packages).

(Perhaps this topic could be a better fit for R-help.)

R-devel is the right place for this.

Best,

luke

--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:   luke-tier...@uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
