On 31/01/2021 3:57 p.m., Martin Møller Skarbiniks Pedersen wrote:
This is really puzzling me and when I try to make a small example
everything works like expected.
The problem:
I got these two large vectors of strings.
str(s1)
chr [1:766608] "0.dk" ...
str(s2)
chr [1:59387] "043.dk" "0606.dk" "0618.dk" "0888.dk" "0iq.dk" "0it.dk" ...
And I need to create the union-set of s1 and s2.
I expect the size of the union-set to be between 766608 and 766608+59387.
However, it is 681193, which is less than the number of elements in s1!
length(base::union(s1, s2))
[1] 681193
Any hints?
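I imagine unique(s1) is shorter than s1, i.e. s1 contains duplicates. For
your data, the union function is the same as
unique(c(s1, s2))
(The only difference is if s1 or s2 is named: the names are dropped.) So
the lower bound on the result is length(unique(s1)), not length(s1).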
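
A small sketch with made-up vectors (not your data; the values are only
illustrative) shows how duplicates in s1 can make the union shorter than
s1 itself:

```r
# Hypothetical toy vectors: s1 repeats "0.dk" three times.
s1 <- c("0.dk", "0.dk", "0.dk", "043.dk", "0606.dk")
s2 <- c("043.dk", "0888.dk")

length(s1)            # 5
length(unique(s1))    # 3, so s1 has duplicates
anyDuplicated(s1)     # non-zero: index of the first duplicated element

u <- union(s1, s2)
length(u)             # 4, which is less than length(s1)

# For unnamed character vectors, union() matches unique(c(s1, s2)).
identical(u, unique(c(s1, s2)))   # TRUE
```

Running length(unique(s1)) and anyDuplicated(s1) on the real vectors
should confirm or rule out this explanation.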
Duncan Murdoch
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help