Correction: the gt[] values are coded and cannot be used as is. The
bcf_gt_allele() and bcf_gt_is_phased() macros can be used to get the 0/1
etc. values.

petr

On Tue, 2015-05-19 at 14:34 +0200, Petr Danecek wrote:
> Hi Zhihao,
> 
> the get_format function returns the total number of values, this is how
> it is used:
> 
> int ngts = bcf_get_format_int32(hdr, rec, "GT", &gt, &ngt_arr);
> ngts /= nsmpl;
> for (i=0; i<nsmpl; i++)
> {
>    for (j=0; j<ngts; j++) 
>        printf(" %d",gt[i*ngts+j]); // todo: check for missing 
>                                    // values and different ploidy
> }
> 
> 
> One can avoid using the C api completely and do instead
> 
> bcftools query -f'%CHROM\t%POS[\t%GT]\n' | ./my-script
> 
> Petr
> 
> 
> On Tue, 2015-05-19 at 12:20 +0100, Zhihao Ding wrote:
> > Dear samtools developers, 
> > 
> > I’ve been trying to learn htslib for some simple analyses. I’d
> > like to parse 1KG phase3 vcfs to obtain a haplotype matrix
> > consisting of 1s and 0s, for a given region. After reading 
> > comments in the header files, below is where I’ve got to. 
> > I haven’t got the matrix yet, but I am getting genotypes 
> > of samples one by one. 
> > 
> > Could I ask if I am on the right direction? Are there better 
> > ways of doing it? 
> > 
> > Any help would be very appreciated. 
> > 
> > Best wishes, 
> > Zhihao
> > 
> > 
> > 
> > 
> >   const char* vcf = “test/test.vcf.gz";
> > 
> >   int n    = 0;  // number of records in file
> >   int nsnp = 0;  // number of SNPs in file
> >   int ngt_arr = 0;
> >   int ngt     = 0;
> >   int *gt     = NULL;
> > 
> >         htsFile * inf = bcf_open(vcf, "r");
> >         if (inf == NULL) {
> >           return EXIT_FAILURE;
> >         }
> > 
> >         bcf_hdr_t *hdr = bcf_hdr_read(inf);
> >         fprintf(stderr, "File %s contains %i samples\n", vcf, 
> > bcf_hdr_nsamples(hdr));
> > 
> >         bcf1_t *rec = bcf_init();
> >       
> >         while (bcf_read(inf, hdr, rec) == 0) {
> >                 n++;
> >                 if (bcf_is_snp(rec)) {
> >                         nsnp++;
> >                 } else {
> >                         continue;
> >                 }
> > 
> >                 if ( bcf_get_format_int32(hdr, rec, "GT", &gt, &ngt_arr) > 
> > 0 ){
> >                   for (int i=0; i<bcf_hdr_nsamples(hdr); i++){
> >                     printf("chr%s\t%i\t%s\t%s\t%s\n",
> >                            seqnames[rec->rid],
> >                            rec->pos,
> >                            rec->d.allele[0],
> >                            rec->d.allele[bcf_gt_allele(gt[0])],
> >                            hdr -> samples[i]);
> >                   }
> >                 }
> >         }
> >         
> >         free(gt);
> >         bcf_hdr_destroy(hdr);
> >         bcf_close(inf);
> >         bcf_destroy(rec);
> > 
> > 
> > ------------------------------------------------------------------------------
> > One dashboard for servers and applications across Physical-Virtual-Cloud 
> > Widest out-of-the-box monitoring support with 50+ applications
> > Performance metrics, stats and reports that give you Actionable Insights
> > Deep dive visibility with transaction tracing using APM Insight.
> > http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
> > _______________________________________________
> > Samtools-help mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/samtools-help
> 
> 
> 
> 




-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 

------------------------------------------------------------------------------
One dashboard for servers and applications across Physical-Virtual-Cloud 
Widest out-of-the-box monitoring support with 50+ applications
Performance metrics, stats and reports that give you Actionable Insights
Deep dive visibility with transaction tracing using APM Insight.
http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
_______________________________________________
Samtools-help mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/samtools-help

Reply via email to