Thank you for the follow-up; I'm still learning my way through SNP analysis.
I checked the VCF specification and the bcftools annotate instructions again. At this point I am not sure I understand: does each tag (e.g. WebId, LocusTag, etc.) need to be in a different column of my tab-delimited file?

Regards,
Max

Massimiliano S. Tagliamonte
Graduate Student
University of Florida
College of Veterinary Medicine
Department of Infectious Diseases and Pathology

________________________________________
From: Petr Danecek <[email protected]>
Sent: Friday, November 6, 2015 5:35 AM
To: Tagliamonte,Massimiliano S
Cc: John Marshall; [email protected]
Subject: Re: [Samtools-help] bcftools annotate could not parse header line

Hi Massimiliano,

your FEATURE tag is defined as neither INFO nor FORMAT tag, please check the VCF specification
http://samtools.github.io/hts-specs/

Best wishes,
Petr

On Thu, 2015-11-05 at 15:58 +0000, Tagliamonte,Massimiliano S wrote:
> OK, sorry to bother again.
>
> I replaced all the underscores, but now I am getting 'The tag "FEATURE" is
> not defined in my_file.tab.gz'
>
> This is my command:
>
> bcftools annotate -a my_file.tab.gz \
> -c CHROM,FROM,TO,FEATURE \
> -h bcftools_annots.hdr \
> -O v -o ./filtering/my_snps_bcftools_annotated.vcf \
> my_snps.vcf.gz
>
> The tab file has no header, and only 4 columns (chrom name, gene start , gene
> end, annotation ('FEATURE') column. I have checked the instructions on
> http://www.htslib.org/doc/bcftools.html#annotate but I am not sure what I am
> doing wrong. This is the tab file first line:
>
> Pf3D7_01_v3 29510 37126
> ID=PF3D7_0100100;Name=PF3D7_0100100;description=erythrocyte+membrane+protein+1%2C+PfEMP1+%28VAR%29;size=7617;WebId=PF3D7_0100100;LocusTag=PF3D7_0100100;size=7617;Alias=VAR-UPSB1,124505645,MAL1P4.01,VAR,PF3D7_0100100,7670005,PFA0005w
>
> Thanks again for your time and kind attention,
> Max
>
>
> Massimiliano S. Tagliamonte
> Graduate Student
> University of Florida
> College of Veterinary Medicine
> Department of Infectious Diseases and Pathology
>
>
> ________________________________________
> From: Tagliamonte,Massimiliano S
> Sent: Thursday, November 5, 2015 9:50 AM
> To: John Marshall
> Cc: [email protected]
> Subject: Re: [Samtools-help] bcftools annotate could not parse header line
>
> Great, I'll replace the underscores then.
>
> Thanks for your help,
> Max
>
> Massimiliano S. Tagliamonte
> Graduate Student
> University of Florida
> College of Veterinary Medicine
> Department of Infectious Diseases and Pathology
>
> ________________________________________
> From: John Marshall <[email protected]>
> Sent: Thursday, November 5, 2015 6:47 AM
> To: Tagliamonte,Massimiliano S
> Cc: [email protected]
> Subject: Re: [Samtools-help] bcftools annotate could not parse header line
>
> On 4 Nov 2015, at 21:25, Tagliamonte,Massimiliano S <[email protected]>
> wrote:
> > I am trying to add an annotation column to my vcf file, after calling
> > variants with the Samtools 1.x pipeline. I am using bcftools annotate, but
> > I keep getting the same error regarding one of the headers:
> > Could not parse the header line:
> > "##FEATURE=<web_id=STRING_TAG,Number=1,Type=STRING,Description="PF3D7_0100100">"
>
> It's complaining about the underscore in your "web_id" key. Prior to VCF
> v4.3, the spec gave no hints about what characters might be in INFO et al
> field keys [1], and somewhat unfortunately htslib/bcftools allowed for only
> letters and digits. This has been relaxed on the develop branch in GitHub
> [2] and underscores and (non-leading) dots will be accepted by the next
> bcftools release.
>
> In the meantime, you could either build htslib and bcftools from the
> development branches in their GitHub repositories, or remove the underscores
> from your web_id and locus_tag to get this to work with bcftools 1.2.
>
> John
>
> [1] In the v4.3 spec, see §1.6.1/8
> [2]
> https://github.com/samtools/htslib/commit/30fb9eee41953958923c56f7ea0af5a5b0376b94
>
> --
> The Wellcome Trust Sanger Institute is operated by Genome Research
> Limited, a charity registered in England with number 1021457 and a
> company registered in England with number 2742969, whose registered
> office is 215 Euston Road, London, NW1 2BE.
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Samtools-help mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/samtools-help
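
On the question of whether each tag needs its own column: the single annotation column in the example line packs several key=value pairs separated by semicolons, and the VCF specification does not permit semicolons or equals signs inside an INFO value, so one plausible approach is to split that column into one column per tag and define each tag as an INFO field. The following is only a rough sketch of that idea, not a method given anywhere in this thread. The choice of WebId and LocusTag, the function names, the Description text, and the use of "." as a placeholder for a missing value are all assumptions made for illustration.

```python
# Hypothetical selection of attributes to expose as INFO tags. The names are
# taken from the example line in the thread; choosing these two is an assumption.
TAGS = ["WebId", "LocusTag"]


def parse_attributes(field):
    """Split a 'key=value;key=value' string into a dict (first value wins)."""
    attrs = {}
    for item in field.split(";"):
        key, sep, value = item.partition("=")
        if sep and key not in attrs:
            attrs[key] = value
    return attrs


def to_annotation_row(line, tags=TAGS):
    """Rewrite 'chrom, start, end, attributes' as one tab-separated column per tag."""
    chrom, start, end, field = line.rstrip("\n").split("\t", 3)
    attrs = parse_attributes(field)
    # Assumption: "." stands in for a tag that is absent from this row.
    values = [attrs.get(tag, ".") for tag in tags]
    return "\t".join([chrom, start, end] + values)


def header_lines(tags=TAGS):
    """Build one ##INFO definition per tag, for the file passed with -h."""
    return [
        f'##INFO=<ID={tag},Number=1,Type=String,'
        f'Description="{tag} from the annotation file">'
        for tag in tags
    ]


if __name__ == "__main__":
    example = (
        "Pf3D7_01_v3\t29510\t37126\t"
        "ID=PF3D7_0100100;Name=PF3D7_0100100;size=7617;"
        "WebId=PF3D7_0100100;LocusTag=PF3D7_0100100"
    )
    print(to_annotation_row(example))
    print("\n".join(header_lines()))
```

With the file laid out this way, the column list in the command quoted above would presumably become -c CHROM,FROM,TO,WebId,LocusTag, and the printed ##INFO lines would go into the file given to -h. Attributes whose values contain commas or encoded spaces (such as Alias or description in the example line) would need extra care and are deliberately left out of this sketch.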
