Hi.

As a follow-up to my previous email: vcfutils.pl varFilter can filter on several quality annotations, as shown below:

Usage:   vcfutils.pl varFilter [options] <in.vcf>

         -Q INT    minimum RMS mapping quality for SNPs [10]
         -d INT    minimum read depth [2]
         -D INT    maximum read depth [10000000]
         -a INT    minimum number of alternate bases [2]
         -w INT    SNP within INT bp around a gap to be filtered [3]
         -W INT    window size for filtering adjacent gaps [10]
         -1 FLOAT  min P-value for strand bias (given PV4) [0.0001]
         -2 FLOAT  min P-value for baseQ bias [1e-100]
         -3 FLOAT  min P-value for mapQ bias [0]
         -4 FLOAT  min P-value for end distance bias [0.0001]
         -e FLOAT  min P-value for HWE (plus F<0) [0.0001]
         -p        print filtered variants

What I am looking for is a suggestion on which of these parameters and thresholds to use as a starting point.
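
To make the depth and mapping-quality options concrete, here is a minimal Python sketch of the kind of per-record check that -Q, -d, -D and -a describe. It is only an illustration, not vcfutils.pl itself: the function names are hypothetical, the thresholds are simply the defaults from the usage text (not recommendations), and the gap-window and P-value options (-w, -W, -1 to -4, -e) are left out. It assumes the INFO column carries the MQ, DP and DP4 annotations, and it skips a check when the annotation is missing.

```python
# Simplified illustration of the -Q, -d, -D and -a checks.
# Not a reimplementation of vcfutils.pl; function names are hypothetical.


def parse_info(info):
    """Turn a VCF INFO string such as 'DP=14;MQ=37' into a dict."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")  # flags like INDEL get value ''
        fields[key] = value
    return fields


def passes_hard_filters(vcf_line, min_mq=10, min_dp=2,
                        max_dp=10000000, min_alt=2):
    """Return True if a tab-separated VCF record passes the simple checks.

    Defaults are copied from the varFilter usage text; they are not
    tuned for any particular dataset.
    """
    cols = vcf_line.rstrip("\n").split("\t")
    info = parse_info(cols[7])  # column 8 of a VCF record is INFO

    # -Q: minimum RMS mapping quality
    if "MQ" in info and float(info["MQ"]) < min_mq:
        return False

    # -d / -D: minimum and maximum read depth
    if "DP" in info:
        depth = int(info["DP"])
        if depth < min_dp or depth > max_dp:
            return False

    # -a: minimum number of alternate bases.
    # DP4 is ref-forward, ref-reverse, alt-forward, alt-reverse.
    if "DP4" in info:
        counts = [int(x) for x in info["DP4"].split(",")]
        if counts[2] + counts[3] < min_alt:
            return False

    return True


if __name__ == "__main__":
    record = "chr1\t100\t.\tA\tG\t50\t.\tDP=14;MQ=37;DP4=2,3,5,4"
    print(passes_hard_filters(record))
    # A stricter, purely illustrative depth window:
    print(passes_hard_filters(record, min_dp=20, max_dp=200))
```

Keeping the thresholds as keyword arguments makes it easy to try different depth windows on WGS and WES data separately, since the two are likely to need different values.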
On 09/10/14 13:13, mehar wrote:
Hi,

Thank you for your response.

I am working with the dog genome; the dog is a diploid organism and its genome is about as big as the human genome. We have both WGS and WES data, we are stuck with a huge number of variants in both datasets, and we would like to start with hard filtering.

In the paper http://arxiv.org/pdf/1404.0929.pdf, certain filters applicable to a set of variant callers were chosen and applied to the authors' datasets. However, no thresholds were mentioned.

It would be valuable if someone could point to filters that apply specifically to samtools.

Regards
Mehar
On 08/10/14 18:20, Tim Fennell wrote:
Depending on a) whether you’re dealing with human, another diploid organism or something else and b) what kind of data you have (wgs, exome, other) you might start with Heng’s CHM1 paper as an interesting read:
http://arxiv.org/pdf/1404.0929.pdf

-t

On Oct 8, 2014, at 9:58 AM, mehar <[email protected]> wrote:

Hi all,

I know that filtering variants manually, using thresholds on quality values, is subject to all sorts of caveats. Still, it is better than nothing, so I am writing to ask for suggestions on hard filtering variants.

Could someone provide *generic recommendations* for filtering with samtools that would at least give a starting point for analysing the data?

Awaiting suggestions!

------------------------------------------------------------------------------
Meet PCI DSS 3.0 Compliance Requirements with EventLog Analyzer
Achieve PCI DSS 3.0 Compliant Status with Out-of-the-box PCI DSS Reports
Are you Audit-Ready for PCI DSS 3.0 Compliance? Download White paper
Comply to PCI DSS 3.0 Requirement 10 and 11.5 with EventLog Analyzer
http://pubads.g.doubleclick.net/gampad/clk?id=154622311&iu=/4140/ostg.clktrk
_______________________________________________
Samtools-help mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/samtools-help


