Thanks very much Petr. This is very helpful!

Best, 
Zhihao


> On 19 May 2015, at 16:01, Petr Danecek <[email protected]> wrote:
> 
> Correction: the gt[] values are coded and cannot be used as is. The
> bcf_gt_allele() and bcf_gt_is_phased() macros can be used to get the 0/1
> etc. values.
> 
> petr
> 
> On Tue, 2015-05-19 at 14:34 +0200, Petr Danecek wrote:
>> Hi Zhihao,
>> 
>> the get_format function returns the total number of values, this is how
>> it is used:
>> 
>> int ngts = bcf_get_format_int32(hdr, rec, "GT", &gt, &ngt_arr);
>> ngts /= nsmpl;
>> for (i=0; i<nsmpl; i++)
>> {
>>   for (j=0; j<ngts; j++) 
>>       printf(" %d",gt[i*ngts+j]); // todo: check for missing 
>>                                   // values and different ploidy
>> }
>> 
>> 
>> One can avoid using the C api completely and do instead
>> 
>> bcftools query -f'%CHROM\t%POS[\t%GT]\n' | ./my-script
>> 
>> Petr
>> 
>> 
>> On Tue, 2015-05-19 at 12:20 +0100, Zhihao Ding wrote:
>>> Dear samtools developers, 
>>> 
>>> I’ve been trying to learn htslib for some simple analyses. I’d
>>> like to parse 1KG phase3 vcfs to obtain a haplotype matrix
>>> consisting of 1s and 0s, for a given region. After reading 
>>> comments in the header files, below is where I’ve got to. 
>>> I haven’t got the matrix yet, but I am getting genotypes 
>>> of samples one by one. 
>>> 
>>> Could I ask if I am on the right direction? Are there better 
>>> ways of doing it? 
>>> 
>>> Any help would be very appreciated. 
>>> 
>>> Best wishes, 
>>> Zhihao
>>> 
>>> 
>>> 
>>> 
>>>  const char* vcf = “test/test.vcf.gz";
>>> 
>>>  int n    = 0;  // number of records in file
>>>  int nsnp = 0;  // number of SNPs in file
>>>  int ngt_arr = 0;
>>>  int ngt     = 0;
>>>  int *gt     = NULL;
>>> 
>>>        htsFile * inf = bcf_open(vcf, "r");
>>>        if (inf == NULL) {
>>>          return EXIT_FAILURE;
>>>        }
>>> 
>>>        bcf_hdr_t *hdr = bcf_hdr_read(inf);
>>>        fprintf(stderr, "File %s contains %i samples\n", vcf, 
>>> bcf_hdr_nsamples(hdr));
>>> 
>>>        bcf1_t *rec = bcf_init();
>>> 
>>>        while (bcf_read(inf, hdr, rec) == 0) {
>>>                n++;
>>>                if (bcf_is_snp(rec)) {
>>>                        nsnp++;
>>>                } else {
>>>                        continue;
>>>                }
>>> 
>>>                if ( bcf_get_format_int32(hdr, rec, "GT", &gt, &ngt_arr) > 0 
>>> ){
>>>                  for (int i=0; i<bcf_hdr_nsamples(hdr); i++){
>>>                    printf("chr%s\t%i\t%s\t%s\t%s\n",
>>>                           seqnames[rec->rid],
>>>                           rec->pos,
>>>                           rec->d.allele[0],
>>>                           rec->d.allele[bcf_gt_allele(gt[0])],
>>>                           hdr -> samples[i]);
>>>                  }
>>>                }
>>>        }
>>> 
>>>        free(gt);
>>>        bcf_hdr_destroy(hdr);
>>>        bcf_close(inf);
>>>        bcf_destroy(rec);
>>> 
>>> 
>>> ------------------------------------------------------------------------------
>>> One dashboard for servers and applications across Physical-Virtual-Cloud 
>>> Widest out-of-the-box monitoring support with 50+ applications
>>> Performance metrics, stats and reports that give you Actionable Insights
>>> Deep dive visibility with transaction tracing using APM Insight.
>>> http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
>>> _______________________________________________
>>> Samtools-help mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/samtools-help
>> 
>> 
>> 
>> 
> 
> 
> 
> 
> -- 
> The Wellcome Trust Sanger Institute is operated by Genome Research 
> Limited, a charity registered in England with number 1021457 and a 
> company registered in England with number 2742969, whose registered 
> office is 215 Euston Road, London, NW1 2BE. 


------------------------------------------------------------------------------
One dashboard for servers and applications across Physical-Virtual-Cloud 
Widest out-of-the-box monitoring support with 50+ applications
Performance metrics, stats and reports that give you Actionable Insights
Deep dive visibility with transaction tracing using APM Insight.
http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
_______________________________________________
Samtools-help mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/samtools-help

Reply via email to