On Tue, Jun 30, 2009 at 8:32 AM, Simon Anders <[email protected]> wrote:

> Hi Michael et al.
>
> Assume I have a list of AlignedRead objects, with data from several Solexa
> lanes. I would like to get a coverage vector over all the lanes.
>
> As 'coverage' takes only one AlignedRead object, I have two possibilities:
>
> (a) Concatenate the AlignedRead objects to a single big one. As far as I
> can see, 'rbind2' is not implemented for AlignedRead, and 'append' seemed
> very slow to me. It is probably not posible to do this without making a copy
> of all the data in memory.
>

I think this is what combineLanes() in the chipseq package does, and it
seems to work fine for us. Note that it takes a GenomeDataList, rather than
a list of AlignedReads, but ShortRead makes it easy to get a GenomeDataList,
if I remember right.

Of course, optimization is welcome.


> (b) Calculate the coverage for each AlignedRead object separately and add
> up the GenomeData objects. The `+` operator is not supported for GenomeData
> objects but it is for Rle objects. So I need to loop through the
> chromosomes.
>

> I've now written this short function for the purpose:
>
> sumUpCoverage <- function( lanes, seqLens )
> {
>   res <- NULL
>   for( i in 1:length(lanes) ) {
>      cvg <- coverage( lanes, width=seqLens )
>      if( is.null( res ) )
>         res <- cvg
>      else {
>         stopifnot( all( names(res) == names(cvg) ) )
>         for( seq in names(res) )
>            res[[seq]] <- res[[seq]] + cvg[[seq]]
>      }
>   }
>   res
> }
>
> This does not seem very elegant (and it takes 9 minutes, which, however,
> might be ok for 29 mio reads). Any idea how to do it better? (The use of
> 'for' instead of 'sapply' is on purpose: I hope to save memory that way.)
>
> And would it make sense to overload the `+` operator for GenomeData
> objects? If so, could I suggest adding this to BSgenome?
>
> Cheers
>  Simon
>
>
>

        [[alternative HTML version deleted]]

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing

Reply via email to