Bert, > Turns out that there's an even faster alternative.
hah, but i'm *not* surprised. thanks for sharing the (current) low-price leader! and, thanks, again, to Kimmo for posting such a productive question! cheers, Greg ---- my apply user system elapsed 8.465 0.103 8.578 Bert's !duplicated user system elapsed 2.397 0.000 2.399 Bert's x[,2]>x[,1] user system elapsed 1.068 0.000 1.069 Bert's table()-based user system elapsed 0.235 0.000 0.235 my a.d.f(unique(cbind(do.call))) user system elapsed 4.470 0.000 4.475 Eric Berger's unique(...pmin...pmax) user system elapsed 0.820 0.017 0.837 Eric Berger's transmuting tibble... user system elapsed 0.936 0.000 0.938 Kimmo Elo's [OP] mutating paste user system elapsed 50.769 0.000 50.821 Rui Barradas' sort-based user system elapsed 42.001 0.053 42.112 ---- ---- n <- 1000 x <- expand.grid(Source = 1:n, Target = 1:n) cat("my apply\n") system.time({ y <- apply(x, 1, function(y) return (c(A=min(y), B=max(y)))) unique(t(y))}) # user system elapsed # 5.075 0.034 5.109 cat("Bert's !duplicated\n") system.time({ x[!duplicated(cbind(do.call(pmin, x), do.call(pmax, x))), ] }) # user system elapsed # 1.340 0.013 1.353 # Still more efficient and still returning a data frame is: cat("Bert's x[,2]>x[,1]\n") system.time({ w <- x[, 2] > x[,1] x[w, ] <- x[w, 2:1] unique(x)}) # user system elapsed # 0.693 0.011 0.703 cat("Bert's table()-based") f <- function(x){ w <- x[,2] > x[,1] x[w, ] <- x[w, 2:1] x$counts <- as.vector(table(x)) ## drop the dim x[x$counts>0, ] } system.time({ f(x) }) cat("my a.d.f(unique(cbind(do.call)))\n") system.time({ as.data.frame(unique(cbind(A=do.call(pmin,x), B=do.call(pmax,x)))) }) cat("Eric Berger's unique(...pmin...pmax)\n") system.time({ unique(data.frame(V1=pmin(x$Source,x$Target), V2=pmax(x$Source,x$Target))) }) cat("Eric Berger's transmuting tibble...\n") require(dplyr) xt<-tibble(x) system.time({ xt %>% transmute( a=pmin(Source,Target), b=pmax(Source,Target)) %>% unique() %>% rename(Source=a, Target=b) }) cat("Kimmo Elo's [OP] mutating paste\n") system.time({ xt %>% mutate(pair=mapply(function(x,y) paste0(sort(c(x,y)),collapse="-"), Source, Target)) %>% distinct(pair, .keep_all = T) %>% mutate(Source=sapply(pair, function(x) unlist(strsplit(x, split="-"))[1]), Target=sapply(pair, function(x) unlist(strsplit(x, split="-"))[2])) %>% select(-pair) }) cat("Rui Barradas' sort-based\n") system.time({ apply(x, 1, sort) |> t() |> unique() }) ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.