Hi Timothy,and thanks for the answer. Loops where exactly what I was trying to avoid as much as possible. My initial idea was that that once I had recursive indexes at my disposal (which were retrieved over recursive loops), I could simply use it in a similar manner as we do with indexes (that is 'do-all-at-once', like in 'x <- x[-idx.drop]'). But I think even though recursive indexes are nice, you can't get around looping, which I think in turn means that you constantly have to adapt your recursive index set to the most recent 'state' of your list.
In case you're interested, in the attachment you'll find my current solution ('listDuplicatesProcess.txt' including the example script 'listDuplicatesProcess_examples.txt'). It builds on some other code, so you'd have to source 'flatten.txt' and 'envirToList' as well.
Regards, Janko On 23.05.2011 14:23, Timothy Bates wrote:
Dear Janko, I think requires a for loop. The approach I took here was mark the dups, then dump them all in one hit: testData = expand.grid(letters[1:4],c(1:3)) testData$keep=F uniqueIDS = unique(testData$Var1) for(thisID in uniqueIDS) { firstCaseOnly = match(thisID,testData$Var1) testData[firstCaseOnly,"keep"]=T } (testData = testData[testData$keep==T,]) On 23 May 2011, at 11:59 AM, Janko Thyson wrote:Dear list, I'm trying to solve something pretty basic here, but I can't really come up with a good solution. Basically, I would just like to remove duplicated named elements in lists via a their respective recursive indexes (given that I have a routine that identifies these recursive indexes). Here's a little example: # VECTORS # Here, it's pretty simple to remove duplicated entries y<- c(1,2,3,1,1) idx.dupl<- which(duplicated(y)) y<- y[-idx.dupl] # / # LISTS x<- list(a=list(a.1.1=1, a.1.1=2, a.1.1=3)) x[[c(1,1)]] x[[c(1,2)]] # Should be removed. x[[c(1,3)]] # Should be removed. # Let's say a 'checkDuplicates' routine would give me: idx.dupl<- list(c(1,2), c(1,3)) # Remove first duplicate: x[[idx.dupl[[1]]]]<- NULL x # Problem: # Once I remove the first duplicate, my duplicate index would have to be # updated as well as there is not third element anymore. x[[idx.dupl[[2]]]]<- NULL # So something like this would not work: sapply(idx.dupl, function(x.idx){ x[[x.idx]]<<- NULL }) # / Sorry if I'm missing something totally obvious here, but do you have any idea how to solve this? Thanks a lot, Janko ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
#' Coerce Environment to List (Recursively). #' #' Recursively coerces an \code{environment} to a \code{list}. #' #' @param src A an \code{environment} that should be coerced. #' @param order.type A \code{character} vector (length: max=1) specifying if #' the list elements should be ordered and if so, what type of ordering should #' be applied. #' @param ... Further args. #' @return A named \code{list} that corresponds to the recursively coerced #' initial \code{environment}. #' @callGraphPrimitives #' @author Janko Thyson \email{janko.thyson.rstuff@@googlemail.com} #' @seealso \code{\link{flatten}} #' @example inst/examples/envirAsList.R envirToList <- function( src, order.type=c("increasing", "decreasing", "none"), ... ){ if(length(order.type) > 1){ order.type <- order.type[1] } if(class(src) == "environment"){ envir <- new.env() src <- as.list(src) # LOOP OVER ELEMENTS out <- lapply(seq(along=src), function(x.src){ envir$names <- c(envir$names, names(src[x.src])) # RECURSIVE FLATTENING out <- envirToList(src[[x.src]]) return(out) }) names(out) <- envir$names if(order.type == "increasing"){ idx.order <- order(names(out)) out <- out[idx.order] } if(order.type == "decreasing"){ idx.order <- order(names(out), decreasing=TRUE) out <- out[idx.order] } # / } else { out <- src } return(out) }
#' Flatten (Nested) Lists or Environments. #' #' Flatten \code{lists} or \code{environments} according to specifications #' made via arg \code{start.after} and/or arg \code{stop.at}. When keeping #' the defaults, the function will traverse arg \code{src} (if \code{src} is #' an \code{environment}, it is coerced to a \code{list} #' via \code{\link{envirToList}} first) to retrieve the values at the #' respective bottom layers/bottom elements. These values are arranged in a #' named \code{list} where the respective names can be interpreted as the #' the paths to the retrieved values. See examples. #' #' @param src A named (arbitrary deeply nested) \code{list} or an #' \code{environment} that should be flattened. #' @param start.after An \code{integer} specifying the layer after which to #' start the flattening. \code{NULL} means to start at the very top. See #' examples. #' @param stop.at An \code{integer} specifying the layer at which to stop #' the flattening. \code{NULL} means there is no stop criterion. #' @param delim.path A \code{character} (length: 1) specifying how the names of #' the resulting flattened list should be pasted. #' @param .do.debug If \code{TRUE}, print information that might be helpful #' for debugging. #' @param ... Further args. #' @return A named \code{list} that features the desired degree of flattening. #' @callGraphPrimitives #' @author Janko Thyson \email{janko.thyson.rstuff@@googlemail.com} #' @seealso \code{\link{envirToList}} #' @example inst/examples/flatten.R flatten <- function( src, start.after=NULL, stop.at=NULL, .delim.path="/", .delim.index="-", do.index=FALSE, do.original=TRUE, do.warn=TRUE, .do.debug=FALSE, ... ){ #--------------------------------------------------------------------------- # VALIDATE #--------------------------------------------------------------------------- if(!is.list(src) & !is.environment(src)){ stop("Arg 'src' must be a 'list' or an 'environment'.") } if(!is.null(start.after) & !is.null(stop.at)){ if(start.after == 1& stop.at == 1){ msg <- c( "Invalid specification:", paste("* start.after: ", start.after, sep=""), paste("* stop.at: ", stop.at, sep="") ) stop(cat(msg, sep="\n")) } } # /VALIDATE ---------- #--------------------------------------------------------------------------- # INNER FUNCTIONS #--------------------------------------------------------------------------- .startAfterInner <- function( envir, nms, out.1, do.reset=FALSE, ... ){ .do.debug <- envir$.do.debug idx.diff <- diff(c(envir$start.after, length(envir$counter))) if(.do.debug){ cat(c("", "+++", ""), sep="\n") # print("+++") cat("names:", sep="\n") print(names(out.1)) cat("envir$counter:", sep="\n") print(envir$counter) cat("idx.diff:", sep="\n") print(idx.diff) } # UPDATE IF DEGREE OF NESTEDNESS EXCEEDS START CRITERION if(idx.diff > 0){ idx.cutoff <- ( length(envir$counter)-idx.diff+1):length(envir$counter ) idx.left <- envir$counter[-idx.cutoff] nms.1 <- nms[idx.cutoff] names(out.1) <- paste(nms.1, collapse="/") # UPDATE SRC idx.append <- sapply(envir$history, function(x.hist){ all(idx.left == x.hist) }) if(.do.debug){ cat("idx.cutoff:", sep="\n") print(idx.cutoff) cat("idx.left:", sep="\n") print(idx.left) cat("idx.append:", sep="\n") print(idx.append) cat("names remaining:", sep="\n") print(names(out.1)) } if(any(idx.append)){ envir$src[[idx.left]] <- append(envir$src[[idx.left]], values=out.1) } else { envir$src[[idx.left]] <- out.1 # UPDATE HISTORY envir$history <- c(envir$history, list(idx.left)) } envir$out <- envir$src # / } if(idx.diff < 0){ envir$out <- envir$src } # / # RESET if(do.reset){ envir$nms <- envir$nms[-length(envir$nms)] envir$counter <- envir$counter[-length(envir$counter)] } # / return(TRUE) } .updateOutInner <- function( envir, out.1, do.reset=FALSE, ... ){ .do.debug <- envir$.do.debug # UPDATE OUT out.0 <- get("out", envir = envir) out <- c(out.0, out.1) envir$out <- out # / # RESET if(do.reset){ envir$nms <- envir$nms[-length(envir$nms)] envir$counter <- envir$counter[-length(envir$counter)] } # / return(TRUE) } .flattenInner <- function( src, envir, ... ){ .do.debug <- envir$.do.debug if( (class(src)=="list" & length(src) != 0) | (class(src) == "environment" & length(src) != 0) ){ if(class(src) == "environment"){ src <- as.list(src) } # UPDATE envir$counter.history <- c(envir$counter.history, list(envir$counter)) # EXIT IF DEGREE EXCEEDS CUTOFF if(!is.null(envir$stop.at)){ if(length(envir$counter) > envir$stop.at){ # THIS nms <- get("nms", envir=envir) path.nms <- paste(envir$nms, collapse=envir$.delim.path) if(.do.debug){ cat("names:", sep="\n") print(path.nms) } out.1 <- list(src) names(out.1) <- path.nms # / # DECISION ON FLATTENING if(!is.null(envir$start.after)){ .startAfterInner(envir=envir, nms=nms, out.1=out.1, do.reset=TRUE) return(NULL) # } # / } else { .updateOutInner(envir=envir, out.1=out.1, do.reset=TRUE) return(NULL) } } } # / # LOOP OVER ELEMENTS for(i in seq(along=src)){ # UPDATE COUNTER envir$counter <- c(envir$counter, i) # UPDATE NAMES # assign("nms", c(get("nms", envir=envir), names(src[i])), envir=envir) envir$nms <- c(get("nms", envir=envir), names(src[i])) path.nms <- paste(envir$nms, collapse=envir$.delim.path) # UPDATE INDEX idx.append <- !(path.nms %in% names(envir$index)) index.1 <- list(data.frame( name=path.nms, index=paste(envir$counter, collapse=envir$.delim.index), is.top=length(envir$counter) == 1, is.bottom=FALSE, degree=length(envir$counter), duplicate=!idx.append, class=class(src[[i]]), type=typeof(src[[i]]), length=length(src[[i]]), dim={ if(is.null(dim(src[[i]]))){ NA } else { paste(dim(src[[i]]), collapse=" ") } }, stringsAsFactors=FALSE )) # index.1$is.bottom <- rslt names(index.1) <- path.nms envir$index <- c(envir$index, index.1) # / # RECURSIVE FLATTENING rslt <- .flattenInner(src[[i]], envir) # call recursively envir$index[[length(envir$index)]]$is.bottom <- rslt # RESET COUNTER if(i == length(src)){ envir$nms <- envir$nms[-length(envir$nms)] envir$counter <- envir$counter[-length(envir$counter)] } # / # return(FALSE) } # / return(TRUE) } else { # THIS nms <- get("nms", envir=envir) path.nms <- paste(envir$nms, collapse=envir$.delim.path) if(.do.debug){ cat("names:", sep="\n") print(path.nms) } out.1 <- list(src) names(out.1) <- path.nms # / # DECISION ON FLATTENING if(!is.null(envir$start.after)){ .startAfterInner(envir=envir, nms=nms, out.1=out.1) } else { .updateOutInner(envir=envir, out.1=out.1) } # / if(.do.debug){ cat("out.1:", sep="\n") print(out.1) } # UPDATE INDEX idx.append <- !(path.nms %in% names(envir$index)) if(idx.append){ index.1 <- list(data.frame( name=path.nms, index=paste(envir$counter, collapse=envir$.delim.index), is.top=FALSE, is.bottom=TRUE, degree=length(envir$counter), class=class(src[[i]]), type=typeof(src[[i]]), length=length(src[[i]]), dim={ if(is.null(dim(src[[i]]))){ NA } else { paste(dim(src[[i]]), collapse=" ") } }, stringsAsFactors=FALSE )) names(index.1) <- path.nms envir$index <- c(envir$index, index.1) } # / # RESET envir$nms <- envir$nms[-length(envir$nms)] envir$counter <- envir$counter[-length(envir$counter)] # / return(TRUE) } # return(TRUE) } # /INNER FUNCTIONS ---------- #--------------------------------------------------------------------------- # ACTUAL PROCESSING #--------------------------------------------------------------------------- # COERCE TO LIST if(class(src) == "environment"){ src <- envirToList(src=src) } # / # PRESERVE ORIGINAL (just in case) src.0 <- src out <- list() # ENVIR envir <- new.env() envir$.do.debug <- .do.debug envir$counter <- NULL envir$counter.history <- NULL envir$.delim.index <- .delim.index envir$.delim.path <- .delim.path envir$do.warn <- do.warn envir$history <- NULL envir$index <- NULL envir$nms <- NULL envir$out <- list() envir$src <- src envir$start.after <- start.after stop.at.0 <- stop.at if(!is.null(stop.at)){ stop.at.0 <- stop.at if(stop.at == 1){ # OUT VALUE out <- envir$out out <- list( original=NULL, flat=src, index=list(raw=NULL, table=NULL), degree={ c( if(is.null(start.after)){ 0 } else { start.after }, if(is.null(stop.at.0)){ 0 } else { stop.at.0 } ) } ) # / if(do.original){ out$original <- src } # / # CLASS class(out) <- c("FlatList", class(out)) return(out) } else { stop.at <- stop.at - 1 } } envir$stop.at <- stop.at # / # APPLY INNER .flattenInner(src, envir) # WARNINGS if(envir$do.warn){ max.length <- max(sapply(envir$counter.history, length)) # if(!envir$do.block.warning){ if(!is.null(start.after)){ if(start.after > max.length){ warning(paste("Argument 'start.after=", start.after, "' exceeds maximum degree of sublayer nestedness (=", max.length, ").", sep="")) } } if(!is.null(stop.at)){ if(stop.at.0 > max.length){ warning(paste("Argument 'stop.at=", stop.at.0, "' exceeds maximum degree of sublayer nestedness (=", max.length, ").", sep="")) } } } # / # OUT VALUE out <- envir$out out <- list( original=NULL, flat=out, index=list(raw=NULL, table=NULL), degree={ c( if(is.null(start.after)){ 0 } else { start.after }, if(is.null(stop.at.0)){ 0 } else { stop.at.0 } ) } ) # / # PROCESS OUT if(do.index){ index.raw <- envir$index index.table <- do.call("rbind", index.raw) rownames(index.table) <- NULL out$index$raw <- index.raw out$index$table <- index.table } if(do.original){ out$original <- src } # / # CLASS class(out) <- c("FlatList", class(out)) # / # /ACTUAL PROCESSING ---------- return(out) }
#------------------------------------------------------------------------------- # SIMPLE LISTS #------------------------------------------------------------------------------- src <- list(DF=data.frame(A=c(1,2)), vec=c("a", "b")) src <- list(src,src) flatten(src) # /SIMPLE LISTS ---------- #------------------------------------------------------------------------------- # NESTED LISTS #------------------------------------------------------------------------------- src <- list(a=list(a.1=list(a.1.1=list(a.1.1.1=1, a.1.1.1=2, a.1.1.1=3), a.1.2=1, a.1.2=2, a.1.2=3), a.2=list(a.2.1=list())), b=NULL, c=data.frame(a=1:3, b=3:5) ) flatten(src) flatten(src, start.after=1) flatten(src, start.after=1, .do.debug=TRUE) flatten(src, start.after=2) flatten(src, start.after=3) flatten(src, start.after=4) flatten(src, stop.at=1) flatten(src, stop.at=2) flatten(src, stop.at=3) flatten(src, stop.at=4) flatten(src, start.after=1, stop.at=1) flatten(src, start.after=1, stop.at=2) flatten(src, start.after=1, stop.at=3) flatten(src, start.after=1, stop.at=4) flatten(src, start.after=2, stop.at=4) flatten(src, do.index=TRUE) flatten(src, do.index=TRUE, do.return.index=TRUE) # PROFILE require(microbenchmark) prfl <- microbenchmark(flatten(src)) median(prfl$time)/1000000000 prfl <- microbenchmark(flatten(src, do.index=TRUE)) median(prfl$time)/1000000000 prfl <- microbenchmark(flatten(src, do.index=TRUE, do.return.index=TRUE)) median(prfl$time)/1000000000 # / # /NESTED LISTS ---------- #------------------------------------------------------------------------------- # ENVIRONMENTS #------------------------------------------------------------------------------- envir <- new.env() envir$a <- new.env() envir$a$a.1 <- new.env() envir$a$a.1$a.1.1 <- new.env() envir$a$a.1$a.1.1$a.1.1.1 <- NA envir$a$a.1$a.1.2 <- 5 envir$a$a.2 <- new.env() envir$a$a.2$a.2.1 <- list() envir$b <- NULL envir$c <- data.frame(a=1:3, b=3:5) envirToList(src=envir) flatten(envir) flatten(envir, start.after=1) flatten(envir, start.after=1, .do.debug=TRUE) flatten(envir, start.after=2) flatten(envir, start.after=3) flatten(envir, start.after=4) flatten(envir, stop.at=1) flatten(envir, stop.at=2) flatten(envir, stop.at=3) flatten(envir, stop.at=4) flatten(envir, start.after=1, stop.at=1) flatten(envir, start.after=1, stop.at=2) flatten(envir, start.after=1, stop.at=3) flatten(envir, start.after=1, stop.at=4) flatten(envir, do.index=TRUE) flatten(envir, do.index=TRUE, do.return.index=TRUE) # PROFILE require(microbenchmark) prfl <- microbenchmark(flatten(envir)) median(prfl$time)/1000000000 prfl <- microbenchmark(flatten(envir, do.index=TRUE)) median(prfl$time)/1000000000 prfl <- microbenchmark(flatten(envir, do.index=TRUE, do.return.index=TRUE)) median(prfl$time)/1000000000 # / # /ENVIRONMENTS ----------
listDuplicatesProcess <- function( src, handle.duplicates=c("stop", "keep.original", "keep.first", "keep.last", "index.duplicates", "index.all"), do.return.flat=TRUE, .delim.index="-", .delim.path="/", ... ){ .buffer <- new.env() if(length(handle.duplicates) > 1){ handle.duplicates <- handle.duplicates[1] } # FLATTEN if(!("FlatList" %in% class(src))){ src.flat <- flatten( src=src, do.index=TRUE, .delim.index=.delim.index, .delim.path=.delim.path ) } else { src.flat <- src } # / if(!all(src.flat$degree == c(0,0))){ stop("Flatlist of degree c(0,0) required.") } .buffer$src.flat <- src.flat .buffer$index <- NULL idx.dupl <- which(src.flat$index$table$duplicate) src.flat.split <- split( .buffer$src.flat$index$table[idx.dupl,], f=.buffer$src.flat$index$table$name[idx.dupl] ) if(length(idx.dupl)){ if( !is.expression(handle.duplicates)){ if(handle.duplicates == "stop"){ stop("Duplicates identified.") } } JNK <- sapply(src.flat.split, function(x.spl){ if(!is.expression(handle.duplicates)){ idx.dupl <- which(.buffer$src.flat$index$table$name %in% x.spl$name) if(handle.duplicates == "index.duplicates"){ index.new <- list(.buffer$src.flat$index$table[idx.dupl[-1],]) names(index.new) <- unique(x.spl$name) .buffer$index <- c(.buffer$index, index.new) return(NULL) } if(handle.duplicates == "index.all"){ index.new <- list(.buffer$src.flat$index$table[idx.dupl,]) names(index.new) <- unique(x.spl$name) .buffer$index <- c(.buffer$index, index.new) return(NULL) } if(handle.duplicates == "keep.original"){ idx.keep <- 1 idx.dupl <- idx.dupl[-idx.keep] idx.drop <- which( names(.buffer$src.flat$flat) %in% .buffer$src.flat$index$table$name[idx.dupl] )[-idx.keep] } if(handle.duplicates == "keep.first"){ idx.keep <- 2 idx.dupl <- idx.dupl[-idx.keep] idx.drop <- which( names(.buffer$src.flat$flat) %in% .buffer$src.flat$index$table$name[idx.dupl] )[-idx.keep] } if(handle.duplicates == "keep.last"){ idx.keep <- length(idx.dupl) idx.dupl <- idx.dupl[-idx.keep] idx.drop <- which( names(.buffer$src.flat$flat) %in% .buffer$src.flat$index$table$name[idx.dupl] ) idx.drop <- idx.drop[-idx.keep] } } else { #handle.duplicates <- expression(.buffer$src.flat$index$table$class == "numeric") # TODO # Test more thoroughly. idx.dupl <- which(.buffer$src.flat$index$table$name %in% x.spl$name) idx.drop <- which( names(.buffer$src.flat$flat) %in% .buffer$src.flat$index$table$name[idx.dupl] ) idx.expr <- which(eval(handle.duplicates)) if(!length(idx.expr)){ msg <- c( "Duplicate processing criterion could not be matched:", paste("* handle.duplicates: '", handle.duplicates, "'", sep="") ) stop(cat(msg, sep="\n")) } idx.keep <- which(idx.dupl %in% idx.expr) if(!length(idx.keep)){ msg <- c( "Duplicate processing criterion could not be matched:", paste("* handle.duplicates: '", handle.duplicates, "'", sep="") ) stop(cat(msg, sep="\n")) } if(length(idx.keep) > 1){ msg <- c( "Duplicate processing criterion resulted in multiple matches:", paste("* handle.duplicates: '", handle.duplicates, "'", sep=""), paste("* Matches: ", paste(idx.dupl[idx.keep], collapse=" "), sep="") ) stop(cat(msg, sep="\n")) } idx.dupl <- idx.dupl[-idx.keep] idx.drop <- idx.drop[-idx.keep] } idx.dupl.1 <- strsplit( .buffer$src.flat$index$table$index[idx.dupl], split=.delim.index ) .buffer$counter <- 0 idx.dupl.1 <- lapply(idx.dupl.1, function(x){ idx <- as.numeric(x) idx[length(idx)] <- idx[length(idx)] - .buffer$counter .buffer$counter <- .buffer$counter + 1 return(idx) }) JNK <- sapply(idx.dupl.1, function(x.idx){ .buffer$src.flat$original[[x.idx]] <- NULL }) if(length(idx.drop)){ .buffer$src.flat$flat <- .buffer$src.flat$flat[-idx.drop] } .buffer$src.flat$index$raw <- .buffer$src.flat$index$raw[-idx.dupl] .buffer$src.flat$index$table <- .buffer$src.flat$index$table[-idx.dupl,] .buffer$src.flat$index$table$duplicate <- FALSE rownames(.buffer$src.flat$index$table) <- NULL # FIND RIGHT BOTTOM LAYER VALUES idx.bottom <- which( .buffer$src.flat$index$table$is.bottom & .buffer$src.flat$index$table$name == unique(x.spl$name) ) # / # UPDATE INDEX IN TABLE idx.index <- strsplit( .buffer$src.flat$index$table$index[idx.bottom], split="-" ) index.new <- sapply(idx.index, function(x){ x[length(x)] <- 1 paste(x, collapse=.delim.index) }) .buffer$src.flat$index$table$index[idx.bottom] <<- index.new # / # UPDATE INDEX IN RAW tmp.paths <- unique(.buffer$src.flat$index$table$name[idx.bottom]) JNK <- sapply(seq(along=tmp.paths), function(x){ .buffer$src.flat$index$raw[[tmp.paths[x]]]$index <<- index.new[x] }) # / }) } # RETURN VALUE if( handle.duplicates == "index.duplicates" | handle.duplicates == "index.all" ){ out <- .buffer$index } else { if(do.return.flat){ out <- .buffer$src.flat } else { out <- .buffer$src.flat$original } } # / return(out) }
src <- list(a=list(a.1=list(a.1.1=list(a.1.1.1=1, a.1.1.1=2, a.1.1.1=3), a.1.2=1, a.1.2=2, a.1.2=3), a.2=list(a.2.1=list())), b=NULL, c=data.frame(a=1:3, b=3:5)) listDuplicatesProcess( src=src, handle.duplicates="stop", do.return.flat=TRUE, .delim.index="-", .delim.path="/" ) #+++++ listDuplicatesProcess( src=src, handle.duplicates="keep.original", do.return.flat=TRUE, .delim.index="-", .delim.path="/" ) listDuplicatesProcess( src=src, handle.duplicates="keep.original", do.return.flat=FALSE, .delim.index="-", .delim.path="/" ) #+++++ listDuplicatesProcess( src=src, handle.duplicates="keep.first", do.return.flat=FALSE, .delim.index="-", .delim.path="/" ) #+++++ listDuplicatesProcess( src=src, handle.duplicates="keep.last", do.return.flat=FALSE, .delim.index="-", .delim.path="/" ) #+++++ listDuplicatesProcess( src=src, handle.duplicates="index.duplicates", do.return.flat=FALSE, .delim.index="-", .delim.path="/" ) listDuplicatesProcess( src=src, handle.duplicates="index.all", do.return.flat=FALSE, .delim.index="-", .delim.path="/" ) #+++++ src.flat <- flatten(src, do.index=TRUE) listDuplicatesProcess( src=src.flat, handle.duplicates="keep.original", do.return.flat=FALSE, .delim.index="-", .delim.path="/" ) #+++++ testData = expand.grid(letters[1:4],c(1:3)) testData$keep=F uniqueIDS = unique(testData$Var1) for(thisID in uniqueIDS) { firstCaseOnly = match(thisID,testData$Var1) testData[firstCaseOnly,"keep"]=T } (testData = testData[testData$keep==T,])
______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.