Hi,
I'm currently working on a variety of SNP-calling pipelines to update
reference genomes, and have run into a serious yet seemingly simple problem
going from the SAM output of BWA-MEM to a sorted BAM for use with GATK.
I'm using BWA v0.7.12-r044 and samtools v1.2-99-ge2bb18f, with the
following bwa mem call:
    bwa mem -t 30 -R '@RG\tID:1\tSM:1\tPL:ILLUMINA\tLB:1' [ref.fa] [Illumina_SE_reads.fastq.gz] > [SAM file]

Running samtools view -h on the resulting SAM file shows a valid @RG header
line and an RG:Z:1 tag on every alignment record.  Both the header line and
the per-record tags survive SAM->BAM conversion (samtools view -b).  Sorting,
however, whether the output is BAM or SAM, drops the @RG header line while
the RG:Z:1 tags on the alignment records remain.  The records then refer to
a read group the header never declares, which GATK treats as a corrupt
SAM/BAM file.
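
In case it helps anyone reproduce this, here is a minimal sketch of the
consistency check I have in mind.  The function name check_rg is made up for
illustration and is not part of samtools.  It reads SAM text on stdin and
reports any RG:Z: value that has no matching @RG ID in the header, which is
the condition GATK complains about.

```shell
# check_rg is a hypothetical helper, not a samtools command.
# Reads SAM text on stdin; exits 0 if every RG:Z: tag is declared
# by an @RG header line, 1 otherwise.
# Intended use (file name is a placeholder):
#   samtools view -h sorted.bam | check_rg
check_rg() {
  awk -F'\t' '
    $1 == "@RG" {                  # header line: remember each declared ID
      for (i = 2; i <= NF; i++)
        if ($i ~ /^ID:/) declared[substr($i, 4)] = 1
      next
    }
    /^@/ { next }                  # skip all other header lines
    {
      for (i = 12; i <= NF; i++)   # optional tags start at column 12
        if ($i ~ /^RG:Z:/) {
          id = substr($i, 6)
          if (!(id in declared)) missing[id]++
        }
    }
    END {
      n = 0
      for (id in missing) {
        printf "RG %s used by %d record(s) but not declared in header\n", id, missing[id]
        n++
      }
      exit (n > 0)
    }'
}
```

On the unsorted file this should exit 0, and on the sorted output described
above it should list read group 1 as undeclared.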

I've also noticed that the @PG header is lost during the sorting procedure.
Is there a bug in the merging procedure that drops the @RG and @PG headers?
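
To see exactly which header lines go missing, something like the following
sketch could be used.  The function name lost_headers and the file names are
placeholders of my own.  It compares two header dumps (as written by samtools
view -H) and prints the @SQ, @RG, @PG and @CO lines present in the first but
absent from the second.  @HD is left out on purpose, since its sort-order
field is expected to change after sorting.

```shell
# lost_headers is a hypothetical helper, not a samtools command.
# Usage: lost_headers before.txt after.txt
# where each file holds header text, e.g. produced by
#   samtools view -H aligned.bam > before.txt
#   samtools view -H sorted.bam  > after.txt
lost_headers() {
  before=$1
  after=$2
  tmp=$(mktemp) || return 1
  # Collect the non-@HD header lines of the second file, sorted for comm.
  grep -E '^@(SQ|RG|PG|CO)' "$after" | sort -u > "$tmp"
  # Print lines unique to the first file (column 1 of comm).
  grep -E '^@(SQ|RG|PG|CO)' "$before" | sort -u | comm -23 - "$tmp"
  rm -f "$tmp"
}
```

For the behaviour described above, I would expect this to print the @RG and
@PG lines and nothing else.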

The bug does not occur with samtools sort in samtools 0.1.18, so I would
guess that it resides in htslib.

Thanks,
Patrick Reilly
------------------------------------------------------------------------------
_______________________________________________
Samtools-help mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/samtools-help