Thanks, this was an interesting paper.  I am in fact curious about the
substructure search problem.

I would appreciate a sanity check on my understanding of the paper.  My
impression is that partially ordered fingerprints are used in an
initial relational database comparison query to obtain a modestly sized
set of candidate structures, after which a subgraph matching algorithm
(e.g., a VF2 variant) is applied sequentially to each candidate to get
an exact answer.  Is that the general approach?  I also got the
impression that the sequential subgraph matching, not the fingerprint
comparison query, is the performance bottleneck.  Is that generally
true of this approach?
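
To make sure I am describing the same thing, here is a minimal Python
sketch of the filter-then-verify idea as I understand it.  It is not the
paper's method: the count-based fingerprint, the plain backtracking
matcher (standing in for a VF2 variant), and all names (mol, fingerprint,
may_contain, is_substructure, search) are my own assumptions, and the
"database" is just a dict of heavy-atom graphs.

```python
from collections import Counter

def mol(elements, bonds):
    """Build a toy attributed graph: atom index -> element, plus adjacency with bond orders."""
    atoms = dict(enumerate(elements))
    adj = {i: {} for i in atoms}
    for i, j, order in bonds:
        adj[i][j] = order
        adj[j][i] = order
    return {"atoms": atoms, "adj": adj}

def fingerprint(m):
    """Assumed fingerprint: counts of elements and of (element, order, element) bonds."""
    fp = Counter(m["atoms"].values())
    for i, nbrs in m["adj"].items():
        for j, order in nbrs.items():
            if i < j:  # count each bond once
                a, b = sorted((m["atoms"][i], m["atoms"][j]))
                fp[(a, order, b)] += 1
    return fp

def may_contain(target_fp, query_fp):
    """Componentwise <= on counts: a necessary, not sufficient, condition."""
    return all(target_fp[key] >= n for key, n in query_fp.items())

def is_substructure(query, target):
    """Exact check by simple backtracking (a stand-in for VF2, not VF2 itself)."""
    q_atoms = list(query["atoms"])

    def extend(mapping):
        if len(mapping) == len(q_atoms):
            return True
        q = q_atoms[len(mapping)]
        for t in target["atoms"]:
            if t in mapping.values() or query["atoms"][q] != target["atoms"][t]:
                continue
            # Every already-mapped neighbour of q must be bonded to t with the same order.
            if all(target["adj"][t].get(mapping[n]) == order
                   for n, order in query["adj"][q].items() if n in mapping):
                mapping[q] = t
                if extend(mapping):
                    return True
                del mapping[q]
        return False

    return extend({})

def search(query, database):
    """Stage 1: cheap fingerprint screen.  Stage 2: exact matching on the survivors."""
    qfp = fingerprint(query)
    candidates = [name for name, (m, fp) in database.items() if may_contain(fp, qfp)]
    hits = [name for name in candidates if is_substructure(query, database[name][0])]
    return candidates, hits

structures = {
    "butane": mol("CCCC", [(0, 1, 1), (1, 2, 1), (2, 3, 1)]),
    "diethyl ether": mol("CCOCC", [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 4, 1)]),
    "ethanol": mol("CCO", [(0, 1, 1), (1, 2, 1)]),
    "isopropanol": mol("CCOC", [(0, 1, 1), (1, 2, 1), (1, 3, 1)]),
}
database = {name: (m, fingerprint(m)) for name, m in structures.items()}

query = mol("CCC", [(0, 1, 1), (1, 2, 1)])  # a three-carbon chain
candidates, hits = search(query, database)
print("candidates:", candidates)
print("hits:", hits)
```

In this toy data, diethyl ether passes the fingerprint screen (it has
enough carbons and enough C-C single bonds) but fails the exact match,
which is the kind of false positive the second stage exists to remove.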

To answer the earlier question, I am interested in seeing whether graph
database and description logic technologies can be applied to structure
queries.  To experiment with that, I would want a true graph database
representation of structure.  I looked at ChEBI, which like PubChem is
also available in RDF format, but like PubChemRDF it encodes structures
as SMILES strings rather than RDF graphs.

Does anyone know of any structure database that uses an attributed graph
rather than a string representation?

Does anyone know of an open source software package that can convert
SMILES strings into RDF (the brass ring) or into any sort of attributed
graph data structure?  What about open source tools that generate
graphical visualizations from SMILES strings?  I assume those have this
capability buried inside them.  The CDK page cites a few export formats:
SMILES, SDF, InChI, Mol2, CML, *and others*.  Are any of those formats
attributed graph data structures?
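
For concreteness, this is roughly the kind of conversion I have in mind,
as a minimal Python sketch.  Everything in it is an assumption for
illustration: the parser handles only a tiny SMILES subset (unbracketed
organic-subset atoms, -, =, # bonds, branches, single-digit ring
closures; no aromatic atoms, charges, or stereo), the example.org
namespace and the hasAtom/element/connects/bondOrder predicates are
hypothetical rather than an existing vocabulary, and hydrogens are left
implicit.  A real experiment would use a toolkit's SMILES parser instead.

```python
BOND_ORDER = {"-": 1, "=": 2, "#": 3}
# Two-letter symbols come first so "Cl" is not read as "C" followed by "l".
ELEMENTS = ("Cl", "Br", "B", "C", "N", "O", "P", "S", "F", "I")
BASE = "http://example.org/chem/"            # hypothetical namespace
XSD = "http://www.w3.org/2001/XMLSchema#"

def parse_smiles_subset(smiles):
    """Return (atoms, bonds): atoms is a list of element symbols,
    bonds is a list of (atom_index, atom_index, order) tuples."""
    atoms, bonds = [], []
    branch_stack, open_rings = [], {}
    prev, order, pos = None, 1, 0
    while pos < len(smiles):
        ch = smiles[pos]
        if ch in BOND_ORDER:                 # explicit bond symbol for the next bond
            order = BOND_ORDER[ch]
            pos += 1
        elif ch == "(":                      # branch start: remember the branch point
            branch_stack.append(prev)
            pos += 1
        elif ch == ")":                      # branch end: return to the branch point
            prev = branch_stack.pop()
            pos += 1
        elif ch.isdigit():                   # ring-closure digit: open or close a ring bond
            if ch in open_rings:
                start, start_order = open_rings.pop(ch)
                bonds.append((start, prev, max(order, start_order)))
            else:
                open_rings[ch] = (prev, order)
            order = 1
            pos += 1
        else:
            symbol = next((e for e in ELEMENTS if smiles.startswith(e, pos)), None)
            if symbol is None:
                raise ValueError(f"unsupported SMILES token at position {pos}: {ch!r}")
            atoms.append(symbol)
            idx = len(atoms) - 1
            if prev is not None:
                bonds.append((prev, idx, order))
            prev, order = idx, 1
            pos += len(symbol)
    return atoms, bonds

def to_ntriples(mol_id, atoms, bonds):
    """Emit N-Triples lines; bonds are nodes so the bond order can be attached to them."""
    mol = f"<{BASE}{mol_id}>"
    lines = []
    for i, element in enumerate(atoms):
        atom = f"<{BASE}{mol_id}/atom{i}>"
        lines.append(f"{mol} <{BASE}hasAtom> {atom} .")
        lines.append(f'{atom} <{BASE}element> "{element}" .')
    for k, (i, j, order) in enumerate(bonds):
        bond = f"<{BASE}{mol_id}/bond{k}>"
        lines.append(f'{bond} <{BASE}bondOrder> "{order}"^^<{XSD}integer> .')
        for end in (i, j):
            lines.append(f"{bond} <{BASE}connects> <{BASE}{mol_id}/atom{end}> .")
    return lines

atoms, bonds = parse_smiles_subset("CC(=O)O")     # acetic acid, heavy atoms only
triples = to_ntriples("acetic_acid", atoms, bonds)
print("\n".join(triples))
```

With structures stored this way, a substructure query becomes a graph
pattern over atom and bond nodes rather than a string comparison, which
is what I would like to experiment with.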


On 12/2/2020 12:09 PM, Egon Willighagen wrote:
>
> Please have a look at:
> https://jcheminf.biomedcentral.com/articles/10.1186/s13321-018-0282-y
>
> On Tue, Dec 1, 2020 at 4:04 PM Steve Vestal
> <steve.ves...@adventiumlabs.com> wrote:
>
>     Does anyone know of a structure database that can be queried using an
>     RDF query language like SPARQL?  PubChemRDF can be accessed in RDF
>     format, but it encodes structures as SMILES strings, which cannot be
>     queried in this way.
>
>     If not, can anyone suggest open source software that might be used to
>     construct a modest RDF dataset from an existing structure database for
>     the purpose of experimenting?  For example, software that can translate
>     SMILES strings into an annotated graph data structure of some sort?
>
>     Thanks in advance for any suggestions.
>
>
>
>
>     _______________________________________________
>     Blueobelisk-discuss mailing list
>     Blueobelisk-discuss@lists.sourceforge.net
>     <mailto:Blueobelisk-discuss@lists.sourceforge.net>
>     https://lists.sourceforge.net/lists/listinfo/blueobelisk-discuss
>     <https://lists.sourceforge.net/lists/listinfo/blueobelisk-discuss>
>
>
>
> -- 
> Have you heard about Wikidata already? "Use Scholia and Wikidata to
> find scientific literature" is a new tutorial from my colleague Lauren
> Dupuis. https://laurendupuis.github.io/Scholia_tutorial/
>
> -----
> E.L. Willighagen
> Department of Bioinformatics - BiGCaT
> Maastricht University (http://www.bigcat.unimaas.nl/)
> Homepage: http://egonw.github.com/
> Blog: http://chem-bla-ics.blogspot.com/
> PubList: https://www.zotero.org/egonw
> ORCID: 0000-0001-7542-0286 (http://orcid.org/0000-0001-7542-0286)
> ImpactStory: https://impactstory.org/u/egonwillighagen
