Hi Hannes,

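Yes, indexing with explicit columns does what you want. The only side
effect is that the index then works by position (POS) only and does not
take indels into account, so a deletion that starts before the queried
region is no longer returned even when its REF allele extends into it.

tabix -s 1 -b 2 -e 2 filename.gz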
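
To make the difference concrete, here is a small Python sketch of the
overlap test applied to the records from your example. It is only a model
of the behaviour, not tabix's actual code: it assumes that with -p vcf a
record spans POS to POS + len(REF) - 1, and that with -b 2 -e 2 it spans
POS only. The names RECORDS, query and use_ref_length are made up for
illustration.

```python
# Toy model of a region query; each record is (CHROM, POS, REF).
# Coordinates are 1-based and inclusive, as in VCF.
RECORDS = [
    ("Chr1", 12, "AAAAAAAAACAAAAC"),
    ("Chr1", 19, "AACAAAAC"),
    ("Chr1", 20, "ACAAAAC"),
    ("Chr1", 23, "AAAC"),
    ("Chr1", 25, "A"),
]


def query(records, chrom, start, end, use_ref_length):
    """Return the POS of every record whose interval overlaps start..end."""
    hits = []
    for rec_chrom, pos, ref in records:
        # Assumption: the VCF preset extends the record over the REF allele,
        # while explicit "-b 2 -e 2" makes the record a single position.
        rec_end = pos + len(ref) - 1 if use_ref_length else pos
        if rec_chrom == chrom and pos <= end and rec_end >= start:
            hits.append(pos)
    return hits


vcf_preset_hits = query(RECORDS, "Chr1", 25, 45, use_ref_length=True)
pos_only_hits = query(RECORDS, "Chr1", 25, 45, use_ref_length=False)

print(vcf_preset_hits)  # [12, 19, 20, 23, 25]
print(pos_only_hits)    # [25]
```

With the POS-only model every record belongs to exactly one region, so
splitting a file by adjacent regions and joining the pieces again should
not produce duplicates.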

Petr

On Mon, 2015-05-18 at 22:19 +0200, Hannes Svardal wrote:
> Hi,
> 
> 
> I have a vcf with all sites produced by GATK.
> I use bgzip to compress it and tabix 0.2.5 (r1005) to index it.
> (tabix -p vcf filename.gz)
> 
> 
> When I retrieve a region I do not only get the entries in this region
> but also all entries with a position smaller than the desired start
> position if the entry represents a deletion and the length of the
> reference allele reaches the desired start position.
> 
> 
> E.g. if I query for 'Chr1:25-45' I get:
> 
> 
> Chr1   12   .   AAAAAAAAACAAAAC   A          ...
> Chr1   19   .   AACAAAAC          A          ...
> Chr1   20   .   ACAAAAC           A,AAAAAC   ...
> Chr1   23   .   AAAC              A          ...
> Chr1   25   .   A                 .          ...
> ...
> ...
> 
> 
> Is there a way to only get entries for which the pos (column 2) is in
> the interval? 
> 
> 
> I can imagine that the current behavior is sometimes desired, but it
> is problematic in my case. The current behaviour means that if I use
> tabix to split a VCF I will get duplicate entries on joining it again.
> 
> 
> Could I just solve the problem by using tabix -s 1 -b 2 -e 2
> <filename.gz>,  or does -p vcf do anything more sophisticated I should
> be aware of?
> 
> 
> Thanks for your help,
> Hannes
> 
> 
> 
> 
> 
> 
> -- 
> Dr. Hannes Svardal
> Postdoctoral researcher
> Nordborg group
> 
> Gregor Mendel Institute
> Dr. Bohr-Gasse 3
> 1030 Vienna, Austria
> phone: +436803252197
> 
> 
> _______________________________________________
> Samtools-help mailing list
> [email protected] 
> https://lists.sourceforge.net/lists/listinfo/samtools-help




-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 
