Bioc-devel: I have been developing a Bioconductor package for multiple-sample peak calling, and all unit tests for my package run efficiently. However, I have one problem that causes inefficient memory use when building the package on my machine. To get straight to the point: I want to find overlaps among multiple GRanges objects simultaneously and carry out a joint analysis of multiple ChIP-Seq samples, so that weakly enriched regions can be rescued with the help of co-localized evidence from the other GRanges objects. After reviewing all my source code, I found that some pairwise overlaps are indeed computed many times, which causes unnecessary memory usage. Below is the custom function I developed. It works correctly in my current workflow, but it is memory-inefficient.

    library(GenomicRanges)

    grs <- GRangesList(gr1, gr2, gr3, gr4, ...)

    overlap <- function(grs, idx = 1L, FUN = which.min) {
      chosen <- grs[[idx]]
      # overlaps of the chosen sample with itself
      que.hit <- as(findOverlaps(chosen), "List")
      # overlaps of the chosen sample with every other sample
      sup.hit <- lapply(grs[-idx], function(ele_) {
        ans <- as(findOverlaps(chosen, ele_), "List")
        # per query region, index of the hit selected by FUN
        # (by default the one with the smallest p-value)
        out.idx0 <- as(FUN(extractList(ele_$p.value, ans)), "List")
        out.idx0 <- out.idx0[!is.na(out.idx0)]
        ans <- ans[out.idx0]
      })
      res <- c(list(que.hit), sup.hit)
      return(res)
    }

How can I optimize this function so that it uses memory efficiently? How can I avoid computing the same pairwise overlaps between GRanges objects repeatedly? Can anyone suggest possible ways to solve this problem?

Thanks a lot

-- 
Jurat Shahidin
Ph.D. candidate
Dipartimento di Elettronica, Informazione e Bioingegneria
Politecnico di Milano
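
One possible direction for the repeated pairwise overlaps, given only as a rough sketch: call findOverlaps() once per unordered pair of samples and obtain the reverse direction by transposing the stored Hits object with t(). The helper names pairwise_hits() and get_hits() and the toy ranges are hypothetical and only for illustration; whether this fits the rest of the workflow (for example the p-value selection step) is an assumption.

```r
library(GenomicRanges)

# Hypothetical helper: run findOverlaps() once per unordered pair (i < j)
# and store the result under the name "i_j".
pairwise_hits <- function(grs) {
  pairs <- combn(length(grs), 2, simplify = FALSE)
  hits <- lapply(pairs, function(p) findOverlaps(grs[[p[1]]], grs[[p[2]]]))
  names(hits) <- vapply(pairs, paste, character(1), collapse = "_")
  hits
}

# Hypothetical helper: look up the hits for the ordered pair (i, j).
# When i > j, the stored (j, i) result is transposed instead of recomputed.
get_hits <- function(hits, i, j) {
  if (i < j) {
    hits[[paste(i, j, sep = "_")]]
  } else {
    t(hits[[paste(j, i, sep = "_")]])
  }
}

# Toy data (made up for illustration only)
gr1 <- GRanges("chr1", IRanges(c(1, 100), width = 20))
gr2 <- GRanges("chr1", IRanges(c(10, 200), width = 20))
gr3 <- GRanges("chr1", IRanges(c(110, 205), width = 20))
grs <- GRangesList(gr1, gr2, gr3)

hits <- pairwise_hits(grs)        # 3 findOverlaps() calls for 3 samples
h13 <- get_hits(hits, 1, 3)       # sample 1 as query, sample 3 as subject
h31 <- get_hits(hits, 3, 1)       # same pair, roles swapped, no new call
```

With n samples this needs n * (n - 1) / 2 calls to findOverlaps() instead of one call per ordered pair, and each Hits object is kept in memory only once.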