Dear all,
I have been trying to analyze some recently acquired WGS reads
(re-sequencing with MiSeq), but I am running into problems with both the
Picard and GATK tools and I cannot tell where they come from.
My fastq reads are already in the sanger/illumina 1.9 format, as
recognized by the FastQC tool. I have modified the attributes of the
read files from fastq to fastqsanger and successfully performed a BWA
mapping against my reference sequence.
I then filtered the resulting SAM file with "NGS: SAM Tools, Filter
SAM" to keep only paired, mapped reads and reordered the file with "NGS:
Picard, Reorder SAM/BAM", with the option "Truncate sequence names
after first whitespace" enabled.
Since my reads are highly duplicated (according to the FastQC output), I
ran the "NGS: Picard, Mark Duplicate reads" tool, but it removed only 2
duplicate reads. I then added a Read Group with "NGS: Picard, Add or
Replace Groups" and started the SNP calling with GATK, using the
Realigner Target Creator tool. That step produced an empty file, which
made me think something is wrong.
So I tried the mapping again (as the GATK wiki suggests to people who
get an empty file, as I did), running the same steps on reads from
different samples, but I always get the same strange results from the
de-duplication step and the Realigner tool.
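One way to see what each step actually left in the file is to tally the
SAM FLAG bits of the records. The following is only a minimal sketch in
Python, assuming a plain-text SAM file; the function name
tally_sam_flags and the file name mapped.sam are hypothetical, and the
bit values are the ones defined in the SAM specification.

```python
# Bit values of the SAM FLAG field (column 2), as defined in the SAM spec.
FLAG_PAIRED = 0x1
FLAG_PROPER_PAIR = 0x2
FLAG_UNMAPPED = 0x4
FLAG_DUPLICATE = 0x400


def tally_sam_flags(lines):
    """Count records by selected FLAG bits in an iterable of SAM lines.

    tally_sam_flags is a hypothetical helper, not part of Galaxy, Picard or GATK.
    """
    counts = {"total": 0, "paired": 0, "proper_pair": 0,
              "unmapped": 0, "duplicate": 0}
    for line in lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        counts["total"] += 1
        if flag & FLAG_PAIRED:
            counts["paired"] += 1
        if flag & FLAG_PROPER_PAIR:
            counts["proper_pair"] += 1
        if flag & FLAG_UNMAPPED:
            counts["unmapped"] += 1
        if flag & FLAG_DUPLICATE:
            counts["duplicate"] += 1
    return counts


# Usage (the file name is an assumption):
# with open("mapped.sam") as handle:
#     print(tally_sam_flags(handle))
```

Comparing the counts before and after the filtering and Mark Duplicates
steps would show whether the file still contains mapped, paired reads
and how many of them carry the duplicate flag.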
I think there is something wrong during the BWA mapping step, or even in
my fastq reads, but I cannot understand what it is.
Any idea?
And what is the read quality format accepted by Galaxy tools? I know
it's PHRED+33, but what does it look like?
Example 1:
??A????BDDDEDDDDGGGGGGGHHHF##77AEFHIIHIHIIIH##77ACFFHHHIHIIHH#5AEFHHHHHHF#55AFHEAEDHHHHHHFFCFHHH#######64#66=+@DDEGGGGDEDEEBEECCECEEGGEGGGGGGGGEEGGA5C0
or
Example 2:
!!"!!!!#%%%&%%%%((((((()))'!!!!"&')**)*)***)!!!!"$'')))*)**))!!"&'))))))'!!!"')&"&%))))))''$')))!!!!!!!!!!!!!!!%%&((((%&%&&#&&$$&$&&((&((((((((&&(("!$!
I ran the BWA mapping with both types and it worked, but maybe the
problem lies somewhere here.
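The two quality strings can be compared by looking at the ASCII codes
they use. Below is a minimal sketch in Python of a common heuristic, not
of how Galaxy itself decides: characters below ';' (code 59) can only
occur in a Phred+33 string, while characters above 'J' (code 74) point
to a +64 encoding. The function names and both thresholds are
assumptions made for illustration.

```python
def quality_range(qual):
    """Return the (lowest, highest) ASCII code used in a quality string."""
    codes = [ord(ch) for ch in qual]
    return min(codes), max(codes)


def guess_offset(qual):
    """Guess the ASCII offset of a FASTQ quality string.

    Heuristic only: returns 33, 64, or None when the range is ambiguous.
    """
    lo, hi = quality_range(qual)
    if lo < 59:   # below ';' is impossible in the +64 encodings
        return 33
    if hi > 74:   # above 'J' (Q41 in Phred+33) suggests a +64 encoding
        return 64
    return None


def decode_phred(qual, offset=33):
    """Convert a quality string to a list of integer Phred scores."""
    return [ord(ch) - offset for ch in qual]


# Short excerpt taken from Example 1 above.
sample = "HHHF##77AEFHII"
print(quality_range(sample))
print(guess_offset(sample))
print(decode_phred(sample))
```

Read this way, Example 1 spans '#' to 'I', which is Q2 to Q40 under
Phred+33. Example 2 spans '!' to '*', which is only Q0 to Q9. Character
by character, Example 2 matches Example 1 shifted down by 31 with
negative scores clipped to 0, which would be consistent with a Phred+33
string having been converted a second time as if it were Phred+64. That
is an inference from the character ranges alone.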
I hope someone can help me!
Thank you!!!!
Debora