Many thanks for this example, which doesn't entirely cover my case, since I have as many "indexes" entries as "sequences" entries. It was very educational nonetheless, and I used it to come up with something a bit faster than what I had before. The main trick I used, though, was naming all entries in "sequences" and "indexes" like so: names(indexes) <- seq(length(indexes)), and then doing an lapply over names(indexes), which allows me to access both lists easily. What I end up with is this:

fragments <- lapply(
  names(indexes),
  function(x){
    lapply(
      indexes[[x]],
      function(.range){
        .range <- seq.int( .range[1], .range[2] )
        unlist(lapply(sequences[x], '[', .range),use.names=FALSE)
      }
    )
  }
)

Although this is still quite slow, it's much faster than what I had before. Any further comments are highly welcome. I can send the real "sequences" and "indexes" as exported R objects ...

Thanks, Joh

jim holtman wrote:
> Try this one; it is doing a list of 7000 in under 2 seconds:
>
>> sequences <- list(
> +   c("M","G","L","W","I","S","F","G","T","P","P","S","Y","T","Y","L","L","I","M",
> +     "N","H","K","L","L","L","I","N","N","N","N","L","T","E","V","H","T","Y","F",
> +     "N","I","N","I","N","I","D","K","M","Y","I","H","*")
> + )
>>
>> indexes <- list(
> +   list(
> +     c(1,22),c(22,46),c(46, 51),c(1,46),c(22,51),c(1,51)
> +   )
> + )
>>
>> indexes <- rep(indexes,10)
>> sequences <- rep(sequences,7000)
>>
>> system.time({
> +   fragments <- lapply(indexes, function(.seq){
> +     lapply(.seq, function(.range){
> +       .range <- seq(.range[1], .range[2])  # save since we use several times
> +       lapply(sequences, '[', .range)
> +     })
> +   })
> + })
>    user  system elapsed
>    1.24    0.00    1.26
>
> On Fri, Jan 16, 2009 at 3:16 PM, Johannes Graumann
> <johannes_graum...@web.de> wrote:
>> Thanks. Very elegant, but doesn't solve the problem of the outer "for"
>> loop, since I now would rewrite the code like so:
>>
>> fragments <- list()
>> for(iN in seq(length(sequences))){
>>   cat(paste(iN,"\n"))
>>   fragments[[iN]] <-
>>     lapply(indexes[[1]], function(g)sequences[[1]][do.call(seq,
>>     as.list(g))])
>> }
>>
>> still very slow for length(sequences) ~ 7000.
>>
>> Joh
>>
>> On Friday 16 January 2009 14:23:47 Henrique Dallazuanna wrote:
>>> Try this:
>>>
>>> lapply(indexes[[1]], function(g)sequences[[1]][do.call(seq,
>>> as.list(g))])
>>>
>>> On Fri, Jan 16, 2009 at 11:06 AM, Johannes Graumann <
>>> johannes_graum...@web.de> wrote:
>>> > Hello,
>>> >
>>> > I have a list of character vectors like this:
>>> >
>>> > sequences <- list(
>>> >   c("M","G","L","W","I","S","F","G","T","P","P","S","Y","T","Y","L","L","I","M",
>>> >     "N","H","K","L","L","L","I","N","N","N","N","L","T","E","V","H","T","Y","F",
>>> >     "N","I","N","I","N","I","D","K","M","Y","I","H","*")
>>> > )
>>> >
>>> > and another list of subset ranges like this:
>>> >
>>> > indexes <- list(
>>> >   list(
>>> >     c(1,22),c(22,46),c(46, 51),c(1,46),c(22,51),c(1,51)
>>> >   )
>>> > )
>>> >
>>> > What I now want to do is to subset each entry in "sequences"
>>> > (sequences[[1]]) with all ranges in the corresponding low level list
>>> > in "indexes" (indexes[[1]]). Here is what I came up with.
>>> >
>>> > fragments <- list()
>>> > for(iN in seq(length(sequences))){
>>> >   cat(paste(iN,"\n"))
>>> >   tmpFragments <- sapply(
>>> >     indexes[[iN]],
>>> >     function(x){
>>> >       sequences[[iN]][seq.int(x[1],x[2])]
>>> >     }
>>> >   )
>>> >   fragments[[iN]] <- tmpFragments
>>> > }
>>> >
>>> > This works fine, but "sequences" contains thousands of entries and the
>>> > corresponding "indexes" are sometimes hundreds of ranges long, so this
>>> > whole process is EXTREMELY inefficient.
>>> >
>>> > Does somebody out there take the challenge and show me a way on how to
>>> > speed this up?
>>> >
>>> > Thanks for any hints,
>>> >
>>> > Joh
>>> >
>>> > ______________________________________________
>>> > R-help@r-project.org mailing list
>>> > https://stat.ethz.ch/mailman/listinfo/r-help
>>> > PLEASE do read the posting guide
>>> > http://www.R-project.org/posting-guide.html
>>> > and provide commented, minimal, self-contained, reproducible code.
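
The naming trick described at the top of this thread exists only so that one lapply can reach both lists. A possible alternative is base R's Map(), which walks two lists in parallel without names or a counter. The following is a minimal sketch, assuming "sequences" and "indexes" have the same length; the toy data is hypothetical and far smaller than the real objects mentioned in the thread.

```r
# Hypothetical toy data: one character vector per sequence,
# and one list of c(start, end) ranges per sequence.
sequences <- list(
  c("M", "G", "L", "W", "I", "S"),
  c("N", "H", "K", "L", "L")
)
indexes <- list(
  list(c(1, 3), c(2, 6)),
  list(c(1, 2), c(3, 5), c(1, 5))
)

# Map() hands the i-th element of each list to the function,
# so sequences[[i]] is always paired with indexes[[i]].
fragments <- Map(
  function(s, ranges) {
    lapply(ranges, function(r) s[seq.int(r[1], r[2])])
  },
  sequences,
  indexes
)

# Second range of the first sequence: elements 2 to 6.
fragments[[1]][[2]]
```

Whether this is any faster than the lapply-over-names version on the real data is not something this sketch shows; wrapping both variants in system.time() on the real objects would be the way to find out.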