tail mrna_Y5-2.fq
+GAII_0001:6:91:210:549#0/1
ZBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
@GAII_0001:6:91:210:814#0/1
CCCCTCTCTCCAAGCTGGAGGAGCTGAAGGCTCATGANAN
+GAII_0001:6:91:210:814#0/1
`_`a_aZa`a_``_^`__P``ZBBBBBBBBBBBBBBBBBB
@GAII_0001:6:91:210:160#0/1
CTCGCGAAGCTTCTCTGGAGGAGAGTGATGTACGATGNCN
+GAII_0001:6:91:210:160#0/1
a__a_a__a_ba]abbabXa__a_BBBBBBBBBBBBBB
boyce-162-119:mRNA_monocyte jdhahbi$

________________________________
From: Martin Morgan <mtmor...@fhcrc.org>
Cc: bioc-sig-sequencing@r-project.org
Sent: Thu, March 24, 2011 9:39:42 AM
Subject: Re: [Bioc-sig-seq] readFastq() error

On 03/24/2011 09:41 AM, joseph wrote:
> I added a new line character at the end of the file
> echo >> reads.fq
> I got the same numbers when I repeated the analysis

you indicated that there were 16509910 reads in the file, and the test
indicates it's the last read that causes problems, so what does the last
read look like? e.g.,

tail reads.fq

Martin

>
> ------------------------------------------------------------------------
> *From:* Martin Morgan <mtmor...@fhcrc.org>
> *Cc:* bioc-sig-sequencing@r-project.org
> *Sent:* Wed, March 23, 2011 7:44:40 PM
> *Subject:* Re: [Bioc-sig-seq] readFastq() error
>
> On 03/23/2011 05:49 PM, joseph wrote:
> > Hi Martin
> > here is what I got:
> > x = readLines('~/myDir/reads.fq')
> > rd = x[c(FALSE, TRUE, FALSE, FALSE)]
> > qual = x[c(FALSE, FALSE, FALSE, TRUE)]
> > > which(nchar(rd) != nchar(qual))
> > [1] 16509910
> > # that is all the reads in the file
> > # When I tried to count the reads with the same number of characters, I also got all the reads
> > > length(which(nchar(rd) == nchar(qual)))
> > [1] 16509909
>
> I suspect there is a missing end-of-line on the last line of the file.
>
> >
> > Joseph
> >
> > ------------------------------------------------------------------------
> > *From:* Martin Morgan <mtmor...@fhcrc.org>
> > *Cc:* bioc-sig-sequencing@r-project.org
> > *Sent:* Wed, March 23, 2011 4:21:25 PM
> > *Subject:* Re: [Bioc-sig-seq] readFastq() error
> >
> > On 03/23/2011 04:07 PM, Martin Morgan wrote:
> > > On 03/23/2011 03:58 PM, joseph wrote:
> > >> Hello
> > >> How would you fix a FASTQ file that gives the following error when
> > >> read with readFastq()?
> > >> Other lanes from the same flow cell are imported fine with readFastq().
> > >>
> > >> rfq = readFastq("~/myDir", pattern="reads.fq")
> > >> Error: Input/Output
> > >> file(s):
> > >> ~/myDir/reads.fq
> > >> message: IncompatibleTypes
> > >> message: invalid class "ShortReadQ" object: some sread and quality widths differ
> > >>
> > >
> > > you could read the file in
> > >
> > > x = readLines('~/myDir/reads.fq')
> > >
> > > split it into reads and qualities
> > >
> > > rd = x[c(FALSE, TRUE, FALSE, FALSE)]
> > > qual = x[c(FALSE, FALSE, TRUE, FALSE)]
> >
> > oops, x[c(FALSE, FALSE, FALSE, TRUE)]
> >
> > >
> > > and ask which have different numbers of characters
> > >
> > > which(nchar(rd) != nchar(qual))
> > >
> > > Martin
> > >
> > >> head reads.fq
> > >> @GAII_0001:6:1:0:101#0/1
> > >> NCTCANCATTGTTTGGACGGAACAAAACCGGGGACAATCT
> > >> +GAII_0001:6:1:0:101#0/1
> > >> BX[_\B_VXGQQU]]]YTPMGWTZZTVQ_X[TGYPZG[WZ
> > >> @GAII_0001:6:1:0:123#0/1
> > >> NGTGANTCNGCTCATTGCGAGTTTTAACCTTTTCTCTATC
> > >> +GAII_0001:6:1:0:123#0/1
> > >> BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
> > >> @GAII_0001:6:1:0:168#0/1
> > >> NCCAGNCCCAGCAGCCCTTCCTTTTCCCTGCTTACCCTCA
> > >>
> > >> [[alternative HTML version deleted]]
> > >>
> > >> _______________________________________________
> > >> Bioc-sig-sequencing mailing list
> > >> Bioc-sig-sequencing@r-project.org
> > >> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
> > >
> >
> > --
> > Computational Biology
> > Fred Hutchinson Cancer Research Center
> > 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
> >
> > Location: M1-B861
> > Telephone: 206 667-2793
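
The check suggested in the thread (compare the width of each read with the width of its quality string) can be wrapped up as a small sketch in base R. The tail output at the top of the thread suggests the final record may be truncated, since its quality line looks shorter than its read, so the sketch also shows one possible repair: dropping the offending records. The helper names `check_fastq_widths` and `drop_bad_records` are hypothetical and not part of ShortRead, the toy records are made up, and the code assumes a strict four-lines-per-record FASTQ layout.

```r
# Hypothetical helper (not part of ShortRead): return the indices of FASTQ
# records whose read and quality strings differ in width.
# Assumes exactly 4 lines per record; a trailing partial record is ignored.
check_fastq_widths <- function(lines) {
  n_complete <- length(lines) %/% 4            # number of whole records
  lines <- lines[seq_len(n_complete * 4)]
  rd   <- lines[c(FALSE, TRUE, FALSE, FALSE)]  # 2nd line of each record
  qual <- lines[c(FALSE, FALSE, FALSE, TRUE)]  # 4th line of each record
  which(nchar(rd) != nchar(qual))
}

# Hypothetical helper: keep only complete records with matching widths.
drop_bad_records <- function(lines) {
  n_complete <- length(lines) %/% 4
  keep <- setdiff(seq_len(n_complete), check_fastq_widths(lines))
  rec <- (seq_along(lines) - 1) %/% 4 + 1      # record index of every line
  lines[rec %in% keep]
}

# Toy input (made up): the second record has a truncated quality string.
toy <- c("@r1", "ACGT", "+r1", "IIII",
         "@r2", "ACGT", "+r2", "II")

bad <- check_fastq_widths(toy)    # 2
cleaned <- drop_bad_records(toy)  # the four lines of record r1

# On a real file one might do something like the following (paths as in the
# thread; the output file name is an assumption):
# x <- readLines('~/myDir/reads.fq')
# writeLines(drop_bad_records(x), '~/myDir/reads_fixed.fq')
```

Dropping a record discards data, so it is worth inspecting the reported indices first; if only the very last record is affected, the file may simply have been cut short during transfer or writing, and fetching a complete copy would be preferable to trimming.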