On 03/23/2011 05:49 PM, joseph wrote:
Hi Martin,

here is what I got:

x = readLines('~/myDir/reads.fq')
rd = x[c(FALSE, TRUE, FALSE, FALSE)]
qual = x[c(FALSE, FALSE, FALSE, TRUE)]
> which(nchar(rd) != nchar(qual))
[1] 16509910
# that is all the reads in the file
# When I tried to count the reads with the same number of characters, I also got all the reads
> length(which(nchar(rd) == nchar(qual)))
[1] 16509909
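
which() returns the index of the mismatching record, not a count, so only read 16509910 (the last one in the file) has differing read and quality widths; the other 16509909 match. I suspect there is a missing end-of-line on the last line of the file.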
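
A minimal sketch of how one might check this, assuming a small made-up file in place of '~/myDir/reads.fq' (the temp file, the record names r1/r2, and the variable names are hypothetical, not taken from your data). It tests whether the final byte of the file is a newline, repeats the width comparison from earlier in the thread, and writes a copy without the offending record:

```r
# Hypothetical stand-in for '~/myDir/reads.fq': two records, the last one
# with a truncated quality string and no trailing newline.
fq <- tempfile(fileext = ".fq")
cat("@r1\nACGT\n+r1\nBBBB\n@r2\nACGT\n+r2\nBB", file = fq)

# Is the final byte of the file a newline?
con <- file(fq, "rb")
seek(con, where = file.size(fq) - 1)
last_byte <- readBin(con, what = "raw", n = 1)
close(con)
ends_with_newline <- identical(last_byte, charToRaw("\n"))

# Compare read and quality widths, as earlier in the thread.
x <- readLines(fq, warn = FALSE)
rd <- x[c(FALSE, TRUE, FALSE, FALSE)]
qual <- x[c(FALSE, FALSE, FALSE, TRUE)]
bad <- which(nchar(rd) != nchar(qual))

# Drop every line of the offending records (4 lines per record)
# and write a repaired copy; writeLines() ends each line with a newline.
keep <- !(ceiling(seq_along(x) / 4) %in% bad)
fixed <- tempfile(fileext = ".fq")
writeLines(x[keep], fixed)
```

On the real file you would set fq to your own path. Dropping the record is only one option; if the file turns out to be truncated, regenerating it from the original lane may be the safer route.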

Joseph

------------------------------------------------------------------------
*From:* Martin Morgan <mtmor...@fhcrc.org>
*To:* joseph <jdsan...@yahoo.com>
*Cc:* bioc-sig-sequencing@r-project.org
*Sent:* Wed, March 23, 2011 4:21:25 PM
*Subject:* Re: [Bioc-sig-seq] readFastq() error

On 03/23/2011 04:07 PM, Martin Morgan wrote:
> On 03/23/2011 03:58 PM, joseph wrote:
>> Hello
>> How would you fix a FASTQ file that gives the following error when
>> read with readFastq()?
>> Other lanes from the same flow cell are imported fine with readFastq().
>>
>> rfq = readFastq("~/myDir", pattern="reads.fq")
>> Error: Input/Output
>> file(s):
>> ~/myDir/reads.fq
>> message: IncompatibleTypes
>> message: invalid class "ShortReadQ" object: some sread and quality widths
>> differ
>>
>
> you could read the file in
>
> x = readLines('~/myDir/reads.fq')
>
> split it into reads and qualities
>
> rd = x[c(FALSE, TRUE, FALSE, FALSE)]
> qual = x[c(FALSE, FALSE, TRUE, FALSE)]

oops, x[c(FALSE, FALSE, FALSE, TRUE)]

>
> and ask which have different numbers of characters
>
> which(nchar(rd) != nchar(qual))
>
> Martin
>
>> head reads.fq
>> @GAII_0001:6:1:0:101#0/1
>> NCTCANCATTGTTTGGACGGAACAAAACCGGGGACAATCT
>> +GAII_0001:6:1:0:101#0/1
>> BX[_\B_VXGQQU]]]YTPMGWTZZTVQ_X[TGYPZG[WZ
>> @GAII_0001:6:1:0:123#0/1
>> NGTGANTCNGCTCATTGCGAGTTTTAACCTTTTCTCTATC
>> +GAII_0001:6:1:0:123#0/1
>> BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
>> @GAII_0001:6:1:0:168#0/1
>> NCCAGNCCCAGCAGCCCTTCCTTTTCCCTGCTTACCCTCA
>>
>> [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> Bioc-sig-sequencing mailing list
>> Bioc-sig-sequencing@r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>

--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793