Re: [Bioc-sig-seq] transforming bam for TEQC

Martin Morgan Thu, 02 Jun 2011 05:22:47 -0700

On 06/02/2011 01:45 AM, David A. wrote:

Dear Martin, thanks a lot for your suggestion, but I am getting an error
with one of the samples. The other sample seems to load fine, so it
could be that this one is too large. I haven't found information about
this error, can you suggest something?


 > pars<-ScanBamParam(flag=scanBamFlag(isProperPair=TRUE),
what=c("rname","strand","pos","qwidth","seq","isize"))
 > data7<-scanBam('/Data/run5/aligned/s_7.bam',param=pars)[[1]]
Error in .io_bam(.scan_bam, file, index, reverseComplement, tmpl, param
= param) :
too many nucleotides, use 'param=ScanBamParam(which=<...>)'

Hi Dave -- this is an issue with Rsamtools that is on my radar to
address. The problem is with the 'seq' argument, where the total number
of nucleotides exceeds the maximum integer R can represent (2^31 - 1).
The workaround is either to omit 'seq' or to read the data in chunks
(e.g., by chromosome, which="chr1") and concatenate (c(chr1rd, chr2rd);
probably you'd do listOfRangedData = lapply(chrs, function(chr, ...) {
<input chr to RangedData> }); rd = do.call(c, listOfRangedData)).
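
A minimal sketch of the chunk-and-concatenate workaround, applied to the
insert-size question rather than to RangedData. The helper names
(read_isize_chr, read_chunks, mean_insert_size) are hypothetical and not
part of Rsamtools. It assumes the BAM file has an index (.bai), that
GenomicRanges is installed alongside Rsamtools, and that chromosome
lengths can be taken from the BAM header. Since 'isize' is signed
(positive for the leftmost mate, negative for the other), only positive
values are averaged so that each pair counts once.

```r
# Hypothetical helper: read 'isize' for one chromosome of an indexed BAM.
# Package functions are referenced with '::', so nothing is loaded
# until the function is actually called.
read_isize_chr <- function(bam, chr) {
    # Chromosome lengths are stored in the BAM header
    len <- Rsamtools::scanBamHeader(bam)[[1]]$targets[[chr]]
    param <- Rsamtools::ScanBamParam(
        flag  = Rsamtools::scanBamFlag(isProperPair = TRUE),
        what  = "isize",   # 'seq' is omitted on purpose
        which = GenomicRanges::GRanges(chr, IRanges::IRanges(1, len)))
    Rsamtools::scanBam(bam, param = param)[[1]]$isize
}

# Read one chunk per chromosome, then concatenate the chunks.
# 'reader' is any function mapping a chromosome name to a vector.
read_chunks <- function(chrs, reader) {
    per_chr <- lapply(chrs, reader)
    do.call(c, per_chr)
}

# Mean insert size over pairs: drop NA and non-positive values so that
# each pair contributes once (via its leftmost mate).
mean_insert_size <- function(isize) {
    mean(isize[!is.na(isize) & isize > 0])
}
```

With a real file this might be used as
isize = read_chunks(chrs, function(chr) read_isize_chr(bam, chr));
mean_insert_size(isize), where chrs and bam are supplied by the user.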


Martin


It is prompting for using 'which' argument, I guess to select a part of
the file (alignment against hg19), but how can I deal with the BAM file
if I want to load it complete and then calculate the overall mean insert
size?



Thanks,

Dave

 > Date: Tue, 31 May 2011 17:12:22 -0700
 > From: mtmor...@fhcrc.org
 > To: dasol...@hotmail.com
 > CC: bioc-sig-sequencing@r-project.org
 > Subject: Re: [Bioc-sig-seq] transforming bam for TEQC
 >
 > On 05/31/2011 05:42 AM, David A. wrote:
 > >
 > > Hi, I would like to load my paired-end bam file for TEQC using the
 > > TEQC library. In the manual it says that the bed file needed for
 > > paired-end reads should contain read pair ID. How can I get this
 > > format? Some bam2bed converters I know only give the three main
 > > columns, and if I am not wrong the BEDPE format is too ample.
 >
 > Hi Dave -- I haven't used TEQC (looks good, though) but since its
 > get.reads function returns a RangedData object with mate pairs as
 > successive rows (from example(get.reads); reads) it seems like this
 > could be constructed directly from your bam file using
 > Rsamtools::scanBam and IRanges::RangedData. I think you'll start with
 > something like
 >
 > param <- ScanBamParam(flag=scanBamFlag(isProperPair=TRUE),
 > what=c("qname", "pos", "qwidth", "rname"))
 > aln = scanBam(fl, param=param)[[1]]
 > rd = with(aln, RangedData(IRanges(pos, width=qwidth), ID=qname,
 > space=rname))
 >
 > rd[order(rd$space, rd$ID)]
 >
 > Martin
 >
 > >
 > > Any help would be greatly appreciated
 > >
 > > Cheers,
 > >
 > > Dave
 > >
 > > _______________________________________________ Bioc-sig-sequencing
 > > mailing list Bioc-sig-sequencing@r-project.org
 > > https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
 >
 >
 > --
 > Computational Biology
 > Fred Hutchinson Cancer Research Center
 > 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
 >
 > Location: M1-B861
 > Telephone: 206 667-2793



