Awesome! That worked very well, thanks a lot.

P
On Wed, Jun 3, 2015 at 4:26 PM, Heng Li <[email protected]> wrote:

> Bwa-mem -C simply copies FASTQ header comments to output SAM. Therefore,
> the comment must be in the SAM tag format. In your case, you should change
> the FASTQ header to (note the red part):
>
> @HWI-D00595:74:C6DWYANXX:6:1101:1230:2158 cs:Z:*1:N:0:TCCGGAGACGCTCTGA:CUSTOM:1:126:14:9.0:14:100:0:252:42:7:7:42:1.59:ATATGTATACATAT:.:.:126:1.0*
> ATATGTATACATATATATGTATACATATATATGTATACATATATATGTATACATATATATGTATACATATATATGTATACATATATATGTATACATATATATGTATACATATATATGTATACATAT
> +HWI-D00595:74:C6DWYANXX:6:1101:1230:2158 1:N:0:TCCGGAGACGCTCTGA
> /</<<F<FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<FFFFFFFF/FFFFFFFFFFFFF/BFFFFFFFFFFFFBBFBF<BFFF<B/B/FFF<BBFF#
> @HWI-D00595:74:C6DWYANXX:6:1101:1149:2167 cs:Z:*1:N:0:TCCGGAGACGCTCTGA:CUSTOM:4:124:16:6.9:18:81:14:123:45:4:4:44:1.47:TATATATCATATATATGA:ATG:AT:126:0.488095238095*
> ATGTATATACATATATATGAATATATATTCATATATATGTATATACATATATATGTATATACATATATATGAATATATATATATGTATATACATATATATATATGAATATATATTCATATATATAT
> +HWI-D00595:74:C6DWYANXX:6:1101:1149:2167 1:N:0:TCCGGAGACGCTCTGA
> BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFBFFFFFFFFFFF/FFFFFF</BFFFFFFFFFF//<BFFFFFFF/<<BBFFFFFF
>
> Heng
>
> On Jun 3, 2015, at 10:19 AM, Piet Jones <[email protected]> wrote:
>
> > Hi,
> >
> > I am in the process of analysing a set of fastq files (paired-end, human).
> > I annotate the reads with information, which is basically a long string of
> > numbers and short sequences, eg:
> >
> > @HWI-D00595:74:C6DWYANXX:6:1101:1230:2158 *1:N:0:TCCGGAGACGCTCTGA:CUSTOM:1:126:14:9.0:14:100:0:252:42:7:7:42:1.59:ATATGTATACATAT:.:.:126:1.0*
> > ATATGTATACATATATATGTATACATATATATGTATACATATATATGTATACATATATATGTATACATATATATGTATACATATATATGTATACATATATATGTATACATATATATGTATACATAT
> > +HWI-D00595:74:C6DWYANXX:6:1101:1230:2158 1:N:0:TCCGGAGACGCTCTGA
> > /</<<F<FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<FFFFFFFF/FFFFFFFFFFFFF/BFFFFFFFFFFFFBBFBF<BFFF<B/B/FFF<BBFF#
> > @HWI-D00595:74:C6DWYANXX:6:1101:1149:2167 *1:N:0:TCCGGAGACGCTCTGA:CUSTOM:4:124:16:6.9:18:81:14:123:45:4:4:44:1.47:TATATATCATATATATGA:ATG:AT:126:0.488095238095*
> > ATGTATATACATATATATGAATATATATTCATATATATGTATATACATATATATGTATATACATATATATGAATATATATATATGTATATACATATATATATATGAATATATATTCATATATATAT
> > +HWI-D00595:74:C6DWYANXX:6:1101:1149:2167 1:N:0:TCCGGAGACGCTCTGA
> > BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFBFFFFFFFFFFF/FFFFFF</BFFFFFFFFFF//<BFFFFFFF/<<BBFFFFFF
> >
> > I have highlighted the annotation in bold, part of it is the standard
> > annotation such as read number (1 vs 2) and barcode. Now, when I align this
> > with bwa mem I use the "-C" flag, this prints this flag to the samfile
> > after each alignment, which is great, the problem is that samtools
> > complains about this format:
> >
> > -bash-3.2$ samtools view test_1_annotated.sam
> > [E::sam_parse1] unrecognized type
> > [W::sam_read1] parse error at line 95
> > [main_samview] truncated file.
> >
> > Removing the annotation makes samtools run without an issue. I know the
> > annotation violates the samtools format according to the docs. So my
> > question is, is there some other way that I can incorporate this annotation
> > and tug it along with my analysis further downstream (such as filtering
> > using samtools etc)?
> >
> > Kind Regards,
> > Piet Jones
> >
> > ------------------------------------------------------------------------------
> > _______________________________________________
> > Samtools-help mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/samtools-help
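
For anyone finding this thread later: the header change Heng describes can be scripted. Below is a minimal Python sketch, not part of bwa or samtools. The function names `tag_header` and `retag_fastq` are made up for illustration, and it assumes plain four-line FASTQ records with the read name and comment separated by a single space. It wraps the existing comment as a string-typed SAM tag (`cs:Z:` by default, as in Heng's example) and leaves headers alone if they have no comment or already look like a `TAG:Z:value` field.

```python
import re

# A string-typed SAM optional field: two-character TAG, type Z, then
# printable ASCII (spaces allowed) as the value.
SAM_Z_FIELD = re.compile(r"^[A-Za-z][A-Za-z0-9]:Z:[ !-~]*$")


def tag_header(header, tag="cs"):
    """Return the FASTQ header with its comment wrapped as TAG:Z:comment.

    Assumption: name and comment are separated by the first space.
    """
    header = header.rstrip("\n")
    name, sep, comment = header.partition(" ")
    # Nothing to do if there is no comment or it is already a Z tag.
    if not sep or SAM_Z_FIELD.match(comment):
        return header
    return f"{name} {tag}:Z:{comment}"


def retag_fastq(lines, tag="cs"):
    """Yield FASTQ lines, rewriting only the first line of each record.

    Assumption: every record is exactly four lines (no wrapped sequences).
    """
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        yield tag_header(line, tag) if i % 4 == 0 else line
```

To use it, feed the lines of each FASTQ file (both mates for paired-end data) through `retag_fastq`, write the result out, and then align the rewritten files with `bwa mem -C` as before. Only the `@` header line is touched. The `+` line can keep its old comment because it is not copied to the SAM output.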
