[back onlist]

On 12 March 2011 01:08, Joachim Baran <[email protected]> wrote:
>
> On 11-03-10 11:56 PM, "Peter Ansell" <[email protected]> wrote:
>>If neither the key nor the name are going to be URIs, then it would be
>>best to introduce another item to connect them using two RDF triples
>>with a shared URI subject.
> Yes, that sounds like a reasonable approach. I could introduce a URI
> subject for each primary key in the mart tables (there is just one primary
> key from which everything else descends), then use predicates for each
> attribute in the mart and have the actual literal-values in the mart's
> tables as objects.
>
> For example:
> <biomart://dcc-dev.res.oicr.on.ca/pathway_config_1/kegg/238947>
> <http://...#pathway_id> "hsa05200"
> <biomart://dcc-dev.res.oicr.on.ca/pathway_config_1/kegg/238947>
> <http://...#associated_gene_name> "TP53"

I didn't realise that the pathway_id (and other similar primary keys)
would be published identifiers. If there are permanent identifiers
provided by the dataset, such as "hsa05200", it is relatively easy to
construct a URI using them, without resorting to adding an identifier
like 238947 that may not be permanent.

<biomart://dcc-dev.res.oicr.on.ca/pathway_config_1/kegg/kegg_pathway/hsa05200>
<http://...#associated_gene_name> "TP53"

The issue of how to construct the subject URI is what is keeping RDF out
of general acceptance in science, so any well-formed, identifiable URI is
useful while there is still no consensus otherwise. On the other hand,
BioMart may be in a good position to provide a discussion table for this
subject. It is a great opportunity to get a basic consensus and provide
the software support for that consensus, given the large number of
BioMart installations out there that will soon be upgraded to 0.8 with
this inbuilt RDF support available.

Ideally, providers should be able to define a single authoritative URL for
each item and publish using it, but they haven't been able to do this
easily so far. They may be able to easily define the URL format for each
record in a single place for each table, for example:

biomart-config:
mart: kegg;
table: kegg_pathway;
Primary_URI_Structure: biomart://dcc-dev.res.oicr.on.ca/pathway_config_1/kegg/kegg_pathway/${pathway_id}

where ${pathway_id} would be replaced by the value of that field for each
record.

> I have chosen the biomart:// URI to denote that this item is accessible
> within the mart, but not visible as such via HTTP-URL.
>
> Alternatively, I could also use the same biomart:// URI for the objects
> too. Whilst looking bloated, do you think that would leave you with more
> opportunities how you would like to query the mart?
>
> For example:
> <biomart://.../238947> <http://...#pathway_id>
> <biomart://.../238947/hsa05200>
> <biomart://.../238947> <http://...#associated_gene_name>
> <biomart://.../238947/TP53>

In my opinion it isn't useful to create URIs based on strings that aren't
designed as permanent, unambiguous identifiers. For example, the gene
known as TP53 may change meaning, but hsa05200 is likely either to keep
the same meaning or to be completely deprecated, rather than have its
meaning gradually change.

This shouldn't need to be a large part of the discussion, as in
relational databases we always have primary key sets to fall back on, and
unique URIs can be created solely based on them. A mart could create URIs
for other fields to link to other marts, other tables, etc., but that
shouldn't need to influence the discussion of how to create the primary
URI for each record.

>>Just to make sure I know what the situation is, could you give me some
>>sample data for these key name pairs?
> The current situation is ambiguous, because I am just about to finish
> the automatic ontology generation for marts, whereas the SPARQL-results
> still return everything as literal. Let me know what you think of the two
> suggestions I made above, please feel free to contribute your own
> suggestions, and by the end of the (Canadian) day I can send you a mart
> URL where you can test the SPARQL interface.

Cheers,

Peter
_______________________________________________
Users mailing list
[email protected]
https://lists.biomart.org/mailman/listinfo/users
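
A minimal sketch of the per-table URI template idea discussed above, in Python. It is only an illustration under stated assumptions, not how BioMart 0.8 implements anything: the function names (primary_uri, to_triples) and the predicate base http://example.org/kegg# are hypothetical stand-ins (the real predicate namespace is elided in the thread), and percent-encoding the field values is an added precaution that the thread does not mention.

```python
from string import Template
from urllib.parse import quote

# Hypothetical per-table setting, mirroring the Primary_URI_Structure
# example from the message.
PRIMARY_URI_STRUCTURE = Template(
    "biomart://dcc-dev.res.oicr.on.ca/pathway_config_1/kegg/kegg_pathway/${pathway_id}"
)


def primary_uri(record):
    """Build the subject URI for one record from its permanent identifier."""
    # Percent-encode values so unusual characters cannot break the URI
    # (an assumption of this sketch, not something stated in the thread).
    safe = {field: quote(str(value), safe="") for field, value in record.items()}
    # substitute() raises KeyError if the template names a missing field.
    return PRIMARY_URI_STRUCTURE.substitute(safe)


def to_triples(record, predicate_base):
    """Yield (subject, predicate, literal) tuples: one per attribute."""
    subject = primary_uri(record)
    for field, value in record.items():
        yield (subject, predicate_base + field, value)


if __name__ == "__main__":
    record = {"pathway_id": "hsa05200", "associated_gene_name": "TP53"}
    # "http://example.org/kegg#" is a placeholder predicate namespace.
    for s, p, o in to_triples(record, "http://example.org/kegg#"):
        print(f'<{s}> <{p}> "{o}"')
```

Every attribute of a record shares the one subject URI derived from the permanent identifier, while the attribute values themselves stay plain literals.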
