Maybe I am overthinking. If it doesn't change the final output (as regards aromatic SMILES) on ChEMBL, maybe it's not worth worrying about now.
- Noel

On 30 January 2017 at 18:31, Noel O'Boyle <baoille...@gmail.com> wrote:
> Great. One question I've run into is what was the intention of the D2
> etc in the SMARTS patterns. Was it the number of heavy atom neighbors?
> As written, it's the number of explicit nbrs in the graph, which is
> complicated by the fact that OB's SMILES parser currently adds an
> explicit H for H's inside square brackets, e.g. [CH-]. So if the
> patterns were developed by testing on SMILES, then the intended
> D-value is somewhat unclear for patterns that typically match atoms
> with hydrogens but which are written as implicit hydrogens. Confused?
> I am too. :-)
>
> - Noel
>
> On 27 January 2017 at 22:17, Geoffrey Hutchison
> <ge...@geoffhutchison.net> wrote:
>> I should mention on that note, that a collaboration with Carnegie Mellon
>> students produced a parallel implementation of Kekulization using the Eigen3
>> matrix library. They also wrote a CUDA implementation that was modestly
>> faster.
>>
>> It hasn't been ported back to Open Babel yet, but I'll leave the basic code
>> (MIT license) here:
>> https://github.com/NarainKrishnamurthy/chemposer
>>
>> Anyone interested should let me know..
>>
>> Cheers,
>> -Geoff
>>
>> On Fri, Jan 27, 2017 at 5:13 PM, Geoffrey Hutchison
>> <ge...@geoffhutchison.net> wrote:
>>>
>>> I think it's a great idea. Chris Morley had recommended similar concepts
>>> in terms of implicit valence.
>>>
>>> Yes, many of the stranger SMARTS patterns here are for "dodgy" SMILES that
>>> should retain aromaticity. It's possible, perhaps to set some level of "if
>>> it was initially flagged as an aromatic atom, be more lenient" rules in the
>>> code.
>>>
>>> I'd like to continue the concept of an annual release, so in the meantime,
>>> I think experiments are welcome.
>>>
>>> -Geoff
>>>
>>> On Fri, Jan 27, 2017 at 3:03 AM, Noel O'Boyle <baoille...@gmail.com>
>>> wrote:
>>>>
>>>> Hi there,
>>>>
>>>> Here's a heads-up on some work I've been prototyping.
>>>>
>>>> The aromatic atom typer currently uses SMARTS patterns in aromatic.txt
>>>> to assign max/min values of pi electrons. A more efficient approach is
>>>> to simultaneously match against all the SMARTS patterns rather than
>>>> one at a time, and well, to avoid using SMARTS at all.
>>>>
>>>> I've attached a Python prototype that shows the general idea - see the
>>>> function getMinMax (the calls to IsAromatic will have to be removed,
>>>> but are unavoidable here; the "elif"s will become a switch statement; I
>>>> need to think some more about explicit hydrogens). To my mind, the use
>>>> of a direct lookup is as clear, if not clearer, than using SMARTS
>>>> patterns.
>>>>
>>>> I note that the existing tests don't hit all of the patterns, and
>>>> while I can find molecules in ChEMBL that hit almost all of the
>>>> patterns, I'm not sure whether I can find ones where the corresponding
>>>> atom turns out to be aromatic in the end. I have a feeling this is
>>>> because the patterns were added in response to dodgy smiles (e.g.
>>>> using n instead of [nH]) which were reported or found by Geoff.
>>>>
>>>> Regards,
>>>> - Noel
>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>> Check out the vibrant tech community on one of the world's most
>>>> engaging tech sites, SlashDot.org! http://sdm.link/slashdot
>>>> _______________________________________________
>>>> OpenBabel-Devel mailing list
>>>> OpenBabel-Devel@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/openbabel-devel
>>>>
>>>
>>
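
For readers following the thread, here is a rough sketch of what a direct lookup of min/max pi electrons might look like, and of why the D-value question matters. It is not the attached prototype and not Open Babel code: the Atom record, the helper names, the PI_ELECTRONS table and the snake_case get_min_max are all hypothetical, and the table values are textbook Hückel-style counts chosen for illustration, not copied from aromatic.txt. The sketch assumes that keying on total connections (explicit neighbours plus implicit hydrogens) is one possible way to sidestep the explicit-hydrogen ambiguity; whether that matches the intent of the original patterns is exactly the open question above. A real table would also need bond-order information (e.g. exocyclic double bonds), which is omitted here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Atom:
    """Toy atom record for illustration only (not an Open Babel class)."""
    atomic_num: int
    charge: int = 0
    # Atomic numbers of the neighbours stored as explicit nodes in the graph.
    neighbors: List[int] = field(default_factory=list)
    # Hydrogens that are implied rather than stored as graph nodes.
    implicit_h: int = 0


def explicit_degree(atom: Atom) -> int:
    """Number of explicit graph neighbours, explicit hydrogens included."""
    return len(atom.neighbors)


def heavy_degree(atom: Atom) -> int:
    """Number of explicit neighbours that are not hydrogen."""
    return sum(1 for z in atom.neighbors if z != 1)


def total_connections(atom: Atom) -> int:
    """Explicit neighbours plus implicit hydrogens."""
    return len(atom.neighbors) + atom.implicit_h


# Illustrative table: (atomic number, charge, total connections) -> (min, max)
# pi electrons. The values are assumptions for this sketch only.
PI_ELECTRONS: Dict[Tuple[int, int, int], Tuple[int, int]] = {
    (6, 0, 3): (1, 1),    # ordinary aromatic carbon
    (6, -1, 3): (2, 2),   # carbanion, as in [CH-] of cyclopentadienyl
    (7, 0, 2): (1, 1),    # pyridine-type nitrogen
    (7, 0, 3): (2, 2),    # pyrrole-type nitrogen
    (7, 1, 3): (1, 1),    # pyridinium-type nitrogen
    (8, 0, 2): (2, 2),    # furan-type oxygen
    (16, 0, 2): (2, 2),   # thiophene-type sulfur
}


def get_min_max(atom: Atom) -> Tuple[int, int]:
    """Single dictionary lookup in place of matching patterns one at a time.

    Returns (0, 0) when the environment is not in the table.
    """
    key = (atom.atomic_num, atom.charge, total_connections(atom))
    return PI_ELECTRONS.get(key, (0, 0))


if __name__ == "__main__":
    # [CH-] with the hydrogen stored as an explicit graph node ...
    explicit_form = Atom(6, -1, neighbors=[6, 6, 1], implicit_h=0)
    # ... and the same atom with the hydrogen kept implicit.
    implicit_form = Atom(6, -1, neighbors=[6, 6], implicit_h=1)
    print(explicit_degree(explicit_form), explicit_degree(implicit_form))
    print(heavy_degree(explicit_form), heavy_degree(implicit_form))
    print(get_min_max(explicit_form), get_min_max(implicit_form))
```

In this toy model the two representations of [CH-] differ in explicit degree but agree in heavy degree and in total connections, so a lookup keyed on the latter gives the same answer for both. Note that heavy degree alone would not separate n from [nH], which is why the sketch counts hydrogens as well.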