Re: [BlueObelisk-discuss] Structure database that can be queried by SPARQL?

Cristian Bologa Thu, 03 Dec 2020 07:59:01 -0800

What about this paper, Steve? It is based on the one mentioned by Egon.



Interoperable chemical structure search service

https://jcheminf.biomedcentral.com/articles/10.1186/s13321-019-0367-2


Results
We present a SPARQL service that augments existing semantic services by making 
interoperable substructure and similarity searches in small-molecule databases 
possible. The service thus offers new possibilities for querying interoperable 
databases, and simplifies writing of heterogeneous queries that include 
chemical-structure search terms.

Regards,
Cristian


Cristian Bologa, Ph.D.
Research Professor,
Div. of Translational Informatics,
Dept. of Internal Medicine,
Univ. of New Mexico, School of Medicine,
Innovation Discovery&Training Center, MSC09 5025,
700 Camino de Salud NE, Albuquerque, NM 87131
tel: +1 (505) 925-7534
fax: +1 (505) 925-7625
----------------------
"If you never fail, it means you are not trying hard enough"



________________________________
From: Steve Vestal <steve.ves...@adventiumlabs.com>
Sent: Thursday, December 3, 2020 4:08 AM
To: Egon Willighagen
Cc: BlueObelisk-Discuss
Subject: Re: [BlueObelisk-discuss] Structure database that can be queried by 
SPARQL?



Thanks, this was an interesting paper.  I am in fact curious about the 
substructure search problem.

I would appreciate a sanity-check on my understanding of this paper.  My 
impression was that partially ordered fingerprints are used in an initial 
relational database comparison query to obtain a modestly sized set of 
candidate structures, after which a subgraph matching algorithm (e.g., a VF2 
variant) is applied sequentially to each element of that set to get an exact 
answer.  Is that the general approach?  I got the vague impression the 
sequential subgraph matching, not the fingerprint comparison query, is the 
performance bottleneck -- is that generally true in this approach?
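
The two-stage approach described above (screen by fingerprint, then verify each candidate by subgraph matching) can be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: the fingerprint is a toy feature set rather than the partially ordered fingerprints mentioned above, the exact check is plain backtracking standing in for a VF2 variant, and the names make_mol, fingerprint, is_substructure, search, DATABASE and QUERY are all hypothetical.

```python
# Assumed data model: a molecule is (atoms, bonds), where atoms maps
# index -> element symbol and bonds maps frozenset({i, j}) -> bond order.
# Hydrogens are implicit. Symbols are one letter here, so a plain string
# works as the atom list.
def make_mol(atoms, bonds):
    return dict(enumerate(atoms)), {frozenset(b[:2]): b[2] for b in bonds}

def fingerprint(mol):
    """Toy screening fingerprint: elements plus (element, order, element) bonds.

    Every feature of a substructure also occurs in its superstructure, so
    fingerprint(query) <= fingerprint(target) is necessary for a match,
    but not sufficient.
    """
    atoms, bonds = mol
    feats = set(atoms.values())
    for pair, order in bonds.items():
        i, j = tuple(pair)
        a, b = sorted((atoms[i], atoms[j]))
        feats.add((a, order, b))
    return frozenset(feats)

def is_substructure(query, target):
    """Exact check by backtracking (a simplified stand-in for VF2)."""
    q_atoms, q_bonds = query
    t_atoms, t_bonds = target
    order = list(q_atoms)

    def extend(mapping):
        if len(mapping) == len(order):
            return True
        q = order[len(mapping)]
        for t in t_atoms:
            if t in mapping.values() or t_atoms[t] != q_atoms[q]:
                continue
            # Each query bond from q to an already-mapped atom must exist
            # in the target with the same bond order.
            ok = all(
                t_bonds.get(frozenset((t, mapping[p]))) == q_bonds[frozenset((q, p))]
                for p in mapping
                if frozenset((q, p)) in q_bonds
            )
            if ok:
                mapping[q] = t
                if extend(mapping):
                    return True
                del mapping[q]
        return False

    return extend({})

def search(query, database):
    """Stage 1: cheap fingerprint screen. Stage 2: exact check per candidate."""
    q_fp = fingerprint(query)
    candidates = [n for n, mol in database.items() if q_fp <= fingerprint(mol)]
    hits = [n for n in candidates if is_substructure(query, database[n])]
    return candidates, hits

# Small illustrative database, keyed by SMILES for readability only.
DATABASE = {
    "CCO": make_mol("CCO", [(0, 1, 1), (1, 2, 1)]),
    "CC(=O)O": make_mol("CCOO", [(0, 1, 1), (1, 2, 2), (1, 3, 1)]),
    "COC": make_mol("COC", [(0, 1, 1), (1, 2, 1)]),
    "CC": make_mol("CC", [(0, 1, 1)]),
    "CONCC": make_mol("CONCC", [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 4, 1)]),
}
QUERY = make_mol("CCO", [(0, 1, 1), (1, 2, 1)])  # a C-C-O chain, single bonds

if __name__ == "__main__":
    candidates, hits = search(QUERY, DATABASE)
    print("passed screen:", candidates)
    print("exact matches:", hits)
```

In this toy data set, CONCC passes the screen because it contains both a C-C and a C-O single bond, but fails the exact check because no carbon carries both neighbours. The screen only shrinks the candidate set; the sequential exact check still has to run on every survivor, which is where the cost question raised above comes from.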

To answer the earlier question, I am interested in seeing if graph database and 
description logic technologies can be applied to structure queries.  To play 
around with that, I would want a true graph database representation of 
structure.  I looked at ChEBI, which, like PubChem, is also available in RDF 
format and, like PubChemRDF, encodes structures as SMILES strings rather than 
RDF graphs.

Does anyone know of any structure database that uses an attributed graph rather 
than a string representation?

Does anyone know of an open source software package that can convert SMILES 
strings into RDF (brass ring) or any sort of attributed graph data structure?  
What about open source tools to generate graphical visualizations from SMILES 
strings?  I assume those would have this capability buried inside them.  The 
CDK page cites a few export formats, SMILES, SDF, InChI, Mol2, CML, *and 
others*.   Are any of the formats attributed graph data structures?
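
To make the SMILES-to-graph question concrete, here is a minimal sketch that parses a small SMILES subset into an atom list and a bond list and then serializes that attributed graph as N-Triples. It is an assumption-laden toy: the example.org namespace and the hasAtom, hasBond, element, connects and order predicates are made up for illustration and are not a published vocabulary, and the parser ignores aromatic atoms, bracket atoms, charges and stereochemistry. A cheminformatics toolkit would be needed for full SMILES.

```python
import re

EX = "http://example.org/chem#"  # hypothetical namespace, for illustration only
XSD_INT = "http://www.w3.org/2001/XMLSchema#integer"
BOND_ORDER = {"-": 1, "=": 2, "#": 3}
TOKEN = re.compile(r"Cl|Br|[BCNOPSFI]|[-=#()]|\d")

def smiles_to_graph(smiles):
    """Parse a small SMILES subset into (atoms, bonds).

    Supported: B C N O P S F Cl Br I, bonds - = #, branches, single-digit
    ring closures. Anything else raises ValueError.
    """
    tokens = TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unsupported SMILES syntax in {smiles!r}")
    atoms, bonds = [], []      # element symbols; (i, j, order) tuples
    prev, pending = None, 1    # previous atom index; order of the next bond
    stack, rings = [], {}      # open branch points; open ring-closure digits
    for tok in tokens:
        if tok == "(":
            stack.append(prev)
        elif tok == ")":
            prev = stack.pop()
        elif tok in BOND_ORDER:
            pending = BOND_ORDER[tok]
        elif tok.isdigit():
            if tok in rings:
                start, order = rings.pop(tok)
                bonds.append((start, prev, max(order, pending)))
            else:
                rings[tok] = (prev, pending)
            pending = 1
        else:
            atoms.append(tok)
            idx = len(atoms) - 1
            if prev is not None:
                bonds.append((prev, idx, pending))
            prev, pending = idx, 1
    if stack or rings:
        raise ValueError(f"unclosed branch or ring in {smiles!r}")
    return atoms, bonds

def graph_to_ntriples(mol_id, atoms, bonds):
    """Serialize the attributed graph as N-Triples lines (made-up vocabulary)."""
    mol = f"<{EX}{mol_id}>"
    lines = []
    for i, element in enumerate(atoms):
        atom = f"<{EX}{mol_id}_a{i}>"
        lines.append(f"{mol} <{EX}hasAtom> {atom} .")
        lines.append(f'{atom} <{EX}element> "{element}" .')
    for k, (i, j, order) in enumerate(bonds):
        bond = f"<{EX}{mol_id}_b{k}>"
        lines.append(f"{mol} <{EX}hasBond> {bond} .")
        lines.append(f"{bond} <{EX}connects> <{EX}{mol_id}_a{i}> .")
        lines.append(f"{bond} <{EX}connects> <{EX}{mol_id}_a{j}> .")
        lines.append(f'{bond} <{EX}order> "{order}"^^<{XSD_INT}> .')
    return lines

if __name__ == "__main__":
    atoms, bonds = smiles_to_graph("CC(=O)O")  # acetic acid, implicit hydrogens
    print("\n".join(graph_to_ntriples("mol1", atoms, bonds)))
```

Bonds are modelled as nodes of their own rather than as a single atom-to-atom predicate so that the bond order can be attached as an attribute; that is one design choice among several and other encodings are equally plausible.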


On 12/2/2020 12:09 PM, Egon Willighagen wrote:

Please have a look at: 
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-018-0282-y

On Tue, Dec 1, 2020 at 4:04 PM Steve Vestal 
<steve.ves...@adventiumlabs.com> wrote:
Does anyone know of a structure database that can be queried using an
RDF query language like SPARQL?  PubChemRDF can be accessed in RDF
format, but it encodes structures as SMILES strings, which cannot be
queried in this way.

If not, can anyone suggest open source software that might be used to
construct a modest RDF dataset from an existing structure database for
the purpose of experimenting?  For example, software that can translate
SMILES strings into an annotated graph data structure of some sort?

Thanks in advance for any suggestions.




_______________________________________________
Blueobelisk-discuss mailing list
Blueobelisk-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/blueobelisk-discuss


--
Have you heard about Wikidata already? "Use Scholia and Wikidata to find 
scientific literature" is a new tutorial from my colleague Lauren Dupuis. 
https://laurendupuis.github.io/Scholia_tutorial/

-----
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
Blog: http://chem-bla-ics.blogspot.com/
PubList: https://www.zotero.org/egonw
ORCID: 0000-0001-7542-0286 <http://orcid.org/0000-0001-7542-0286>
ImpactStory: https://impactstory.org/u/egonwillighagen

