Hi Petr,

Thanks for the reply.
Strangely, I tried this but it gives me exactly the same result, i.e.,
overlapping indels are again included even if the position is outside the
interval.
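
To make "overlapping" concrete: from what I observe, a record is returned when the span of its reference allele, POS through POS + len(REF) - 1, reaches into the queried interval. Here is a minimal Python sketch of that rule, applied to the records from my original mail (quoted below). The function names are hypothetical, and this is only my reading of the observed behaviour, not tabix's actual code.

```python
def ref_span(pos, ref):
    """Return the 1-based inclusive interval covered by the REF allele."""
    return pos, pos + len(ref) - 1


def overlaps(pos, ref, start, end):
    """True if the REF span intersects the query interval [start, end]."""
    first, last = ref_span(pos, ref)
    return first <= end and last >= start


# (POS, REF) pairs taken from the example in the original mail
records = [
    (12, "AAAAAAAAACAAAAC"),
    (19, "AACAAAAC"),
    (20, "ACAAAAC"),
    (23, "AAAC"),
    (25, "A"),
]

for pos, ref in records:
    # Query interval 25-45, as in the example
    print(pos, ref_span(pos, ref), overlaps(pos, ref, 25, 45))
```

Under this reading, every deletion listed there overlaps 25-45 even though its POS is below 25, which matches what I see.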

How can this be? Does tabix somehow detect that this is a VCF and override
the tabix -s 1 -b 2 -e 2 setting?

I do:
rm filename.vcf.gz.tbi
tabix -s 1 -b 2 -e 2 filename.vcf.gz
tabix filename.vcf.gz 'CAE1:25-45'

And I get records starting at CAE1 12, i.e. before the requested start of 25.
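
As a stopgap, the tabix output could be post-filtered on column 2 so that only records whose POS lies inside the interval are kept. A minimal Python sketch, assuming tab-delimited VCF lines; `filter_by_pos` is a hypothetical helper, not part of tabix, and the sample lines are made up for illustration.

```python
def filter_by_pos(lines, start, end):
    """Yield VCF data lines whose POS (column 2) lies in [start, end]."""
    for line in lines:
        if line.startswith("#"):
            continue  # skip header lines, if any were passed through
        fields = line.rstrip("\n").split("\t")
        if start <= int(fields[1]) <= end:
            yield line


# Made-up records standing in for the output of a region query
sample = [
    "CAE1\t12\t.\tAAAAAAAAACAAAAC\tA\n",
    "CAE1\t25\t.\tA\t.\n",
    "CAE1\t46\t.\tA\t.\n",
]

kept = list(filter_by_pos(sample, 25, 45))
print("".join(kept), end="")
```

In practice the lines would come from the piped output of the tabix query (for example by iterating over sys.stdin) rather than from a hard-coded list.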

Any ideas?

Thanks,
Hannes


On 19 May 2015 at 14:34, Petr Danecek <[email protected]> wrote:

> Hi Hannes,
>
> the command below does what you want and has no side effects, other than
> that the index works by position only and does not take indels into account:
>
> tabix -s 1 -b 2 -e 2
>
> Petr
>
> On Mon, 2015-05-18 at 22:19 +0200, Hannes Svardal wrote:
> > Hi,
> >
> >
> > I have a vcf with all sites produced by GATK.
> > I use bgzip to compress it and tabix 0.2.5 (r1005) to index it.
> > (tabix -p vcf filename.gz)
> >
> >
> > When I retrieve a region I do not only get the entries in this region
> > but also all entries with a position smaller than the desired start
> > position if the entry represents a deletion and the length of the
> > reference allele reaches the desired start position.
> >
> >
> > E.g. if I query for 'Chr1:25-45', I get:
> >
> >
> > Chr1   12 .       AAAAAAAAACAAAAC A      ...
> > Chr1   19 .       AACAAAAC        A     ...
> > Chr1   20 .       ACAAAAC A,AAAAAC       ...
> > Chr1   23 .       AAAC    A     ...
> > Chr1   25 .       A       .      ...
> > ...
> > ...
> >
> >
> > Is there a way to only get entries for which the pos (column 2) is in
> > the interval?
> >
> >
> > I can imagine that the current behaviour is sometimes desired, but it
> > is problematic in my case. The current behaviour means that if I use
> > tabix to split a VCF I will get duplicate entries on joining it again.
> >
> >
> > Could I just solve the problem by using tabix -s 1 -b 2 -e 2
> > <filename.gz>,  or does -p vcf do anything more sophisticated I should
> > be aware of?
> >
> >
> > Thanks for your help,
> > Hannes
> >
> >
> >
> >
> >
> >
> > --
> > Dr. Hannes Svardal
> > Postdoctoral researcher
> > Nordborg group
> >
> > Gregor Mendel Institute
> > Dr. Bohr-Gasse 3
> > 1030 Vienna, Austria
> > phone: +436803252197
> >
> >
> >
> > ------------------------------------------------------------------------------
> > One dashboard for servers and applications across Physical-Virtual-Cloud
> > Widest out-of-the-box monitoring support with 50+ applications
> > Performance metrics, stats and reports that give you Actionable Insights
> > Deep dive visibility with transaction tracing using APM Insight.
> > http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
> > _______________________________________________
> > Samtools-help mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/samtools-help
>
>
>
>
> --
>  The Wellcome Trust Sanger Institute is operated by Genome Research
>  Limited, a charity registered in England with number 1021457 and a
>  company registered in England with number 2742969, whose registered
>  office is 215 Euston Road, London, NW1 2BE.
>
_______________________________________________
Samtools-help mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/samtools-help
