As a mathematician by training and a former practicing mathematician (qualifications I rarely feel compelled to pull out of the closet), I have to agree with Michael's challenge to the original assertion about the "mathematical concept of sets".

Sets are collections of distinct objects (at least in Cantor's original naive definition) and do not have a notion of "duplicate values". In the modern axiomatic definition, one axiom is that "two sets are equal if and only if they contain the same members". To expand on Michael's example, the union of {1, 2} with {1, 3} is {1, 2, 3}, not {1, 2, 1, 3}, since there is only one distinct object designated by the value "1".
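
To make that concrete, here is a small sketch using Python's built-in set type, in the spirit of Michael's example. The variable names (a, b, u) are mine, chosen only for illustration; the sketch says nothing about how R's union() is implemented.

```python
# Two sets that share the member 1.
a = {1, 2}
b = {1, 3}

# Set union: the shared member appears once in the result.
u = a | b

# Equality depends only on membership, so writing 1 twice changes nothing.
print(u == {1, 2, 3})             # True
print({1, 2, 1, 3} == {1, 2, 3})  # True

# The cardinality of the union is the number of distinct members.
print(len(u))                     # 3
```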

A computer programming language could choose to use the ordered vector (or list) [1, 2, 1, 3] as an internal representation of the union of [1,2] and [1,3], but it would then have to work hard to perform every other meaningful set operation. For instance, the cardinality of the union still has to equal three (not four, which is the length of the list), since there are exactly three distinct objects that are members. And, as Michael points out, the set represented by [1,2,3] has to be equal to the set represented by [1,2,1,3] since they contain exactly the same members.
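
A rough sketch of the extra work involved, again in Python. The helpers cardinality() and same_set() are hypothetical names I made up for this illustration; they are not part of R or of any library.

```python
def cardinality(rep):
    """Size of the set represented by a list: count distinct members only."""
    return len(set(rep))


def same_set(x, y):
    """Two list representations denote the same set iff their members agree."""
    return set(x) == set(y)


# Naive "union" by concatenation keeps the repeated 1.
union_rep = [1, 2] + [1, 3]

print(union_rep)                       # [1, 2, 1, 3]
print(len(union_rep))                  # 4, the length of the list
print(cardinality(union_rep))          # 3, the size of the set
print(same_set(union_rep, [1, 2, 3]))  # True
```

Every operation has to discard the repeats before it can answer, which is the "work hard" part.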

  Kevin

On 2/6/2014 9:39 PM, R. Michael Weylandt wrote:
> On Thu, Feb 6, 2014 at 8:31 PM, Carl Witthoft <c...@witthoft.com> wrote:
>> First, let me apologize in advance if this is the wrong place to submit a
>> suggestion for a change to functions in the base-R package.  It never really
>> occurred to me that I'd have an idea worthy of such a change.
>>
>> My idea is to provide an upgrade to all the "sets" tools (intersect, union,
>> setdiff, setequal) that allows the user to apply them in a strictly
>> algebraic style.
>>
>> The current tools, as well documented, remove duplicate values in the input
>> vectors.  This can be helpful in stats work, but is inconsistent with the
>> mathematical concept of sets and set measure.
> No comments about backward-compatibility concerns, etc., but why do you
> think this is closer to the "mathematical concept of sets"? As I
> learned them, sets have no repeats (or order) and other languages with
> set primitives tend to agree:
>
> python> {1,1,2,3} == {1,2,3}
> True
>
> I believe C++ calls what you're looking for a multiset (albeit with a
> guarantee of orderedness).
>
> Cheers,
> Michael

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
