bwa mem -C simply copies the FASTQ header comment to the output SAM record. 
The comment must therefore already be in SAM tag format (TAG:TYPE:VALUE). In 
your case, you should change the FASTQ header as follows (note the "cs:Z:" 
prefix added to the comment):

> @HWI-D00595:74:C6DWYANXX:6:1101:1230:2158 cs:Z:1:N:0:TCCGGAGACGCTCTGA:CUSTOM:1:126:14:9.0:14:100:0:252:42:7:7:42:1.59:ATATGTATACATAT:.:.:126:1.0
> ATATGTATACATATATATGTATACATATATATGTATACATATATATGTATACATATATATGTATACATATATATGTATACATATATATGTATACATATATATGTATACATATATATGTATACATAT
> +HWI-D00595:74:C6DWYANXX:6:1101:1230:2158 1:N:0:TCCGGAGACGCTCTGA
> /</<<F<FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<FFFFFFFF/FFFFFFFFFFFFF/BFFFFFFFFFFFFBBFBF<BFFF<B/B/FFF<BBFF#
> @HWI-D00595:74:C6DWYANXX:6:1101:1149:2167 cs:Z:1:N:0:TCCGGAGACGCTCTGA:CUSTOM:4:124:16:6.9:18:81:14:123:45:4:4:44:1.47:TATATATCATATATATGA:ATG:AT:126:0.488095238095
> ATGTATATACATATATATGAATATATATTCATATATATGTATATACATATATATGTATATACATATATATGAATATATATATATGTATATACATATATATATATGAATATATATTCATATATATAT
> +HWI-D00595:74:C6DWYANXX:6:1101:1149:2167 1:N:0:TCCGGAGACGCTCTGA
> BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFBFFFFFFFFFFF/FFFFFF</BFFFFFFFFFF//<BFFFFFFF/<<BBFFFFFF
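
The header change above could be scripted roughly as follows. This is only a 
minimal sketch, not part of bwa or samtools: the function names 
(tag_fastq_comment, tag_fastq) are made up for illustration, and it assumes 
strict 4-line FASTQ records, a comment separated from the read name by 
whitespace, and no tab characters inside the comment.

```python
import re

# A comment that already looks like a SAM string tag (TAG:Z:VALUE).
SAM_Z_TAG = re.compile(r"^[A-Za-z][A-Za-z0-9]:Z:[ !-~]*$")


def tag_fastq_comment(header, tag="cs"):
    """Wrap the comment of one FASTQ header line as TAG:Z:comment."""
    parts = header.rstrip("\n").split(None, 1)
    if len(parts) < 2:
        # Header has no comment, so there is nothing to tag.
        return parts[0] if parts else ""
    name, comment = parts[0], parts[1].strip()
    if SAM_Z_TAG.match(comment):
        # Already in SAM tag format; leave it untouched.
        return f"{name} {comment}"
    return f"{name} {tag}:Z:{comment}"


def tag_fastq(lines, tag="cs"):
    """Yield FASTQ lines with the header of every 4-line record rewritten.

    The record position is used instead of a leading '@' test because a
    quality line may also start with '@'.
    """
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        yield tag_fastq_comment(line, tag) if i % 4 == 0 else line
```

Feeding the lines of the original FASTQ file through tag_fastq and writing 
the result to a new file should give headers like the ones shown above, 
which bwa mem -C can then copy into the SAM output as a "cs:Z:" tag.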


Heng

On Jun 3, 2015, at 10:19 AM, Piet Jones <[email protected]> wrote:

> Hi,
> 
> I am in the process of analysing a set of fastq files (paired-end, human). I 
> annotate the reads with information, which is basically a long string of 
> numbers and short sequences, eg:
> 
> @HWI-D00595:74:C6DWYANXX:6:1101:1230:2158 1:N:0:TCCGGAGACGCTCTGA:CUSTOM:1:126:14:9.0:14:100:0:252:42:7:7:42:1.59:ATATGTATACATAT:.:.:126:1.0
> ATATGTATACATATATATGTATACATATATATGTATACATATATATGTATACATATATATGTATACATATATATGTATACATATATATGTATACATATATATGTATACATATATATGTATACATAT
> +HWI-D00595:74:C6DWYANXX:6:1101:1230:2158 1:N:0:TCCGGAGACGCTCTGA
> /</<<F<FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<FFFFFFFF/FFFFFFFFFFFFF/BFFFFFFFFFFFFBBFBF<BFFF<B/B/FFF<BBFF#
> @HWI-D00595:74:C6DWYANXX:6:1101:1149:2167 1:N:0:TCCGGAGACGCTCTGA:CUSTOM:4:124:16:6.9:18:81:14:123:45:4:4:44:1.47:TATATATCATATATATGA:ATG:AT:126:0.488095238095
> ATGTATATACATATATATGAATATATATTCATATATATGTATATACATATATATGTATATACATATATATGAATATATATATATGTATATACATATATATATATGAATATATATTCATATATATAT
> +HWI-D00595:74:C6DWYANXX:6:1101:1149:2167 1:N:0:TCCGGAGACGCTCTGA
> BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFBFFFFFFFFFFF/FFFFFF</BFFFFFFFFFF//<BFFFFFFF/<<BBFFFFFF
> 
> I have highlighted the annotation in bold. Part of it is the standard 
> annotation, such as the read number (1 vs 2) and barcode. When I align these 
> reads with bwa mem I use the "-C" flag, which appends the annotation to each 
> alignment in the SAM file. That is great, but samtools complains about the 
> format:
> 
> -bash-3.2$ samtools view test_1_annotated.sam
> [E::sam_parse1] unrecognized type
> [W::sam_read1] parse error at line 95
> [main_samview] truncated file.
> 
> Removing the annotation makes samtools run without an issue. I know the 
> annotation violates the SAM format according to the docs. So my question is: 
> is there some other way that I can incorporate this annotation and carry it 
> along through my downstream analysis (such as filtering with samtools, 
> etc.)?
> 
> Kind Regards,
> Piet Jones
> ------------------------------------------------------------------------------
> _______________________________________________
> Samtools-help mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/samtools-help
