Apologies if this has been covered already; I haven't been following the whole thread.

Genome variant data is just a subset of genome data. My understanding is that 
the semweb BioHackathon group looked at a variety of different kinds of genomic 
data and came up with FALDO[1]. This model looks pretty good to me, and 
importantly there is a converter from GFF3[2,3]. Of all the commonly used 
genome feature formats out there, GFF3 is by far the best at encouraging 
provision of relevant metadata using standard ontologies/terminologies.

VCF is convertible to GVF[4,5], which is a subset of GFF3 with additional 
recommended metadata. It's supported by Ensembl, dbGaP and others, and the 
1000 Genomes data is available in GVF[6].

As GFF3 is convertible to RDF/OWL that uses FALDO and SO, it follows that GVF 
is too (though the converter may need tweaking to take advantage of the 
additional GVF metadata).
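
To make the chain a bit more concrete, here is a rough Python sketch of what 
mapping one GFF3/GVF feature line onto FALDO-style triples could look like. 
This is not the gff3-to-owl converter[3] and I haven't checked it against that 
tool's output. The FALDO terms (Region, ExactPosition, begin, end, position, 
reference, location, the strand classes) come from FALDO[1]; the 
http://example.org/ URIs, the function names and the sample line are made up 
for illustration, and a real converter would use SO term URIs for the feature 
type.

```python
from urllib.parse import unquote

FALDO = "http://biohackathon.org/resource/faldo#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/"  # hypothetical namespace, for illustration only

STRAND_CLASS = {
    "+": FALDO + "ForwardStrandPosition",
    "-": FALDO + "ReverseStrandPosition",
}


def parse_gff3_line(line):
    """Split one 9-column GFF3/GVF feature line into a dict."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError("expected 9 tab-separated columns, got %d" % len(cols))
    seqid, _source, ftype, start, end, _score, strand, _phase, attrs = cols
    attributes = {}
    for pair in attrs.split(";"):
        if pair:
            key, _, value = pair.partition("=")
            attributes[unquote(key)] = unquote(value)
    return {
        "seqid": seqid,
        "type": ftype,
        "start": int(start),
        "end": int(end),
        "strand": strand,
        "attributes": attributes,
    }


def to_faldo_triples(feature):
    """Return (subject, predicate, object) tuples for one parsed feature.

    Assumes the feature has an ID attribute. Simplification: the GFF3 start
    is always used as faldo:begin, whatever the strand.
    """
    feat = BASE + "feature/" + feature["attributes"]["ID"]
    region = feat + "/region"
    triples = [
        # Placeholder type URI; a real converter would use an SO term here.
        (feat, RDF_TYPE, BASE + "type/" + feature["type"]),
        (feat, FALDO + "location", region),
        (region, RDF_TYPE, FALDO + "Region"),
    ]
    for name in ("begin", "end"):
        pos = region + "/" + name
        coord = feature["start"] if name == "begin" else feature["end"]
        triples += [
            (region, FALDO + name, pos),
            (pos, RDF_TYPE, FALDO + "ExactPosition"),
            (pos, FALDO + "position", coord),
            (pos, FALDO + "reference", BASE + "sequence/" + feature["seqid"]),
        ]
        if feature["strand"] in STRAND_CLASS:
            triples.append((pos, RDF_TYPE, STRAND_CLASS[feature["strand"]]))
    return triples


# Made-up GVF-style line, not taken from any real data set.
EXAMPLE_LINE = "chr1\texample\tSNV\t1000\t1000\t.\t+\t.\tID=var1;Variant_seq=A;Reference_seq=G"

feature = parse_gff3_line(EXAMPLE_LINE)
triples = to_faldo_triples(feature)
for triple in triples:
    print(triple)
```

The GVF-specific attributes (Variant_seq, Reference_seq) are parsed but not 
emitted as triples here, which is exactly the kind of tweaking I mean.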

I just wanted to make sure you were aware of all this previous work before 
reinventing anything.

[1] https://github.com/JervenBolleman/FALDO
[2] http://www.sequenceontology.org/gff3.shtml
[3] https://code.google.com/p/gff3-to-owl/
[4] http://www.ncbi.nlm.nih.gov/pubmed/20796305 - A standard variation file 
format for human genome sequences - Reese et al.
[5] http://www.sequenceontology.org/resources/gvf.html
[6] ftp://ftp.ensembl.org/pub/current_variation/gvf/homo_sapiens/

On Apr 1, 2013, at 10:59 AM, Jeremy J Carroll wrote:

> Hi Kingsley,
> I wasn't going to but since you ask:
> http://www.slideshare.net/JeremyJCarroll/vcf-and-rdf
> or
> http://lists.w3.org/Archives/Public/www-archive/2013Apr/att-0002/W3C-JJC-LifeSci.pdf
> On Apr 1, 2013, at 10:13 AM, Kingsley Idehen <kide...@openlinksw.com> wrote:
>> On 4/1/13 1:05 PM, Jeremy J Carroll wrote:
>>> Hi
>>> I am hoping to present the work I am currently doing on VCF and RDF at the 
>>> Clinical Pharmacogenomics TF telecon on Wednesday.
>>> My presentation should cover:
>>> - business background, Syapse Discovery
>>> - some background on VCF as a knowledge representation format
>>> - and some initial results on mapping 1000 genomes into RDF
>>> I will circulate slides shortly
