Dear all,

I have been trying to analyze some recently acquired WGS reads (re-sequencing with MiSeq), but I am having problems with both the Picard and GATK tools and I cannot work out where the problem lies.

My FASTQ reads are already in the Sanger / Illumina 1.9 format, as reported by the FastQC tool. I changed the datatype attribute of the read files from fastq to fastqsanger and successfully ran a BWA mapping against my reference sequence. I then filtered the resulting SAM file with "NGS: SAM Tools, Filter SAM" to keep only reads that are paired and mapped, and reordered the file with "NGS: Picard, Reorder SAM/BAM", with the option "Truncate sequence names after first whitespace" enabled. Since my reads are highly duplicated (according to the FastQC output), I ran the "NGS: Picard, Mark Duplicate reads" tool, but it removed only 2 duplicate reads. I went on to add a Read Group with "NGS: Picard, Add or Replace Groups" and started the SNP calling with GATK, using the Realigner Target Creator tool. That step produced an empty file, and I started to think that something is wrong.

So I tried the mapping again (as the GATK wiki suggests to people who get an empty file, as I did) and ran the same steps on reads from different samples, but I always get the same strange results from the de-duplication step and the Realigner tool. I think something is going wrong during the BWA mapping step, or even in my FASTQ reads, but I cannot work out what it is.
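
One way the mapping and de-duplication output could be sanity-checked is by tallying a few FLAG bits in the SAM records. The following is only a minimal sketch in plain Python 3: the bit values are the ones defined in the SAM specification, but the function name `tally_flags` and the toy records are hypothetical and are not real data.

```python
from collections import Counter

# Bit values of the SAM FLAG field (column 2 of each alignment record)
PROPER_PAIR = 0x2   # each segment properly aligned according to the aligner
UNMAPPED = 0x4      # segment unmapped
DUPLICATE = 0x400   # PCR or optical duplicate


def tally_flags(sam_lines):
    """Count records by selected FLAG bits, skipping '@' header lines."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])
        counts["records"] += 1
        if flag & UNMAPPED:
            counts["unmapped"] += 1
        if flag & PROPER_PAIR:
            counts["proper_pair"] += 1
        if flag & DUPLICATE:
            counts["duplicate"] += 1
    return counts


# Made-up records for illustration only
toy_sam = [
    "@SQ\tSN:ref\tLN:1000",
    "r1\t99\tref\t10\t60\t5M\t=\t50\t45\tACGTA\tIIIII",
    "r1\t147\tref\t50\t60\t5M\t=\t10\t-45\tACGTA\tIIIII",
    "r2\t1123\tref\t10\t60\t5M\t=\t50\t45\tACGTA\tIIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTA\tIIIII",
]
print(dict(tally_flags(toy_sam)))
```

If almost no record carries the duplicate bit, or the proper-pair count is far below the record count, that would point to the mapping step rather than to the later tools.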

Any idea?

And which read quality format do the Galaxy tools accept? I know it is PHRED+33, but what does it look like?

Example 1:

??A????BDDDEDDDDGGGGGGGHHHF##77AEFHIIHIHIIIH##77ACFFHHHIHIIHH#5AEFHHHHHHF#55AFHEAEDHHHHHHFFCFHHH#######64#66=+@DDEGGGGDEDEEBEECCECEEGGEGGGGGGGGEEGGA5C0

or
Example 2:

!!"!!!!#%%%&%%%%((((((()))'!!!!"&')**)*)***)!!!!"$'')))*)**))!!"&'))))))'!!!"')&"&%))))))''$')))!!!!!!!!!!!!!!!%%&((((%&%&&#&&$$&$&&((&((((((((&&(("!$!


I ran the BWA mapping with both types and it worked, but maybe the problem lies somewhere here.
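
To make the comparison concrete, here is a minimal sketch (plain Python 3, assuming the standard Phred+33 convention in which the quality score is the character's ASCII code minus 33) of how the two strings decode. The function names are hypothetical, and the two strings are shortened prefixes of Example 1 and Example 2 above.

```python
def phred33_scores(quality):
    """Decode a FASTQ quality string, assuming a Phred+33 offset."""
    return [ord(ch) - 33 for ch in quality]


def summarize(quality):
    """Report the ASCII range and the Phred+33 score range of a quality string."""
    codes = [ord(ch) for ch in quality]
    scores = phred33_scores(quality)
    return {
        "min_ascii": min(codes),
        "max_ascii": max(codes),
        "min_q": min(scores),
        "max_q": max(scores),
        # Characters below ASCII 59 cannot occur in the older +64 encodings
        "rules_out_offset_64": min(codes) < 59,
    }


# Shortened prefixes of the two examples
example_1 = "??A????BDDDEDDDDGGGGGGGHHHF##77AEFHII"
example_2 = "!!\"!!!!#%%%&%%%%((((((()))'!!!!\"&')**"

print(summarize(example_1))
print(summarize(example_2))
```

Read as Phred+33, the first prefix spans Q2 to Q40, while the second spans only Q0 to Q9, so both are valid Phred+33 strings but the second would describe uniformly very low quality bases.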

I hope someone can help me!

Thank you!!!!
Debora