Hello,

I have start and end coordinates from different experiments (DNase hypersensitivity data), and I would like to combine overlapping intervals. For instance (see my test data below), rows 2 (30-52) and 3 (49-101) should be combined into 30-101. But rows 3 (49-101) and 4 (70-103) should not be combined, because they are on different chromosomes (chr a and chr b).

Does anybody have an idea?

Thanks,
Hermann
> df
  chr start end
1   a     5  10
2   a    30  52
3   a    49 101
4   b    70 103
5   b   100 130
6   b   129 140
> dput(df)
structure(list(chr = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("a", "b"), class = "factor"),
    start = c(5, 30, 49, 70, 100, 129), end = c(10, 52, 101, 103, 130, 140)),
    .Names = c("chr", "start", "end"), row.names = c(NA, -6L), class = "data.frame")

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
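One possible way to do the merging described above, as a minimal base R sketch rather than a definitive solution: split the data by chromosome, sort each piece by start, and open a new group whenever a start lies beyond every end seen so far. The helper name merge_intervals() is made up for this illustration and does not come from any package. The sketch also assumes that intervals which merely touch (a start equal to an earlier end) should be merged; change `>` to `>=` if they should stay separate.

```r
# Test data from the post (chr becomes character or factor depending on the R version)
df <- data.frame(chr   = c("a", "a", "a", "b", "b", "b"),
                 start = c(5, 30, 49, 70, 100, 129),
                 end   = c(10, 52, 101, 103, 130, 140))

# Hypothetical helper: merge overlapping intervals within ONE chromosome
merge_intervals <- function(d) {
  d <- d[order(d$start, d$end), ]
  # largest end coordinate among all earlier rows (-Inf for the first row)
  prev_max_end <- c(-Inf, head(cummax(d$end), -1))
  # a new group starts whenever the start lies beyond every earlier end
  grp <- cumsum(d$start > prev_max_end)
  data.frame(chr   = d$chr[1],
             start = as.numeric(tapply(d$start, grp, min)),
             end   = as.numeric(tapply(d$end, grp, max)))
}

# Apply per chromosome, so intervals on different chromosomes are never combined
merged <- do.call(rbind, lapply(split(df, df$chr, drop = TRUE), merge_intervals))
rownames(merged) <- NULL
merged
#   chr start end
# 1   a     5  10
# 2   a    30 101
# 3   b    70 140
```

Using the running maximum of the ends (cummax) instead of only the previous row's end also covers the case where one long interval contains several later, shorter ones. If Bioconductor is an option, reduce() from the GenomicRanges package performs this kind of merging on GRanges objects, though that would mean converting the data frame first.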