On Tue, Apr 17, 2012 at 7:56 AM, Noel O'Boyle <baoille...@gmail.com> wrote:
> Well, Geoff, if you're going to be working on this I've recently been
> subjecting ChEMBL to some canonicalisation tests, and can supply a good few
> more test cases. I'll wrap them up and email them to you off list tomorrow.

Would it be possible to get all of these cases -- both the Kekulé
form and the "correct" canonicalized aromatic version of each?

I more-or-less threw up my hands on getting the OpenSMILES definition
of aromaticity nailed down, due in part to the fact that I'm not a
chemist and I was never able to spark a conversation to resolve it.
Now I'm thinking that if we have a comprehensive set of example
structures, it might be possible to do it by spelling out a few fairly
simple rules and then enumerating a set of exceptions or special cases.
Or maybe, given a complete set of examples, someone can actually
create a set of rules that handles every case.
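
To make the idea concrete, here is a rough sketch of how such a set of
examples could be checked against any candidate set of rules. Everything
in it is an assumption for illustration: find_mismatches, toy_converter
and the KNOWN table are hypothetical names, the two SMILES pairs are
hand-written rather than taken from Noel's ChEMBL cases, and the
converter is a lookup table standing in for a real toolkit call.

```python
from typing import Callable, Iterable, List, Tuple

# One test case: (Kekule SMILES, expected aromatic SMILES).
Case = Tuple[str, str]


def find_mismatches(
    cases: Iterable[Case],
    to_aromatic: Callable[[str], str],
) -> List[Tuple[str, str, str]]:
    """Return (kekule, expected, actual) for each case the converter gets wrong."""
    failures = []
    for kekule, expected in cases:
        actual = to_aromatic(kekule)
        if actual != expected:
            failures.append((kekule, expected, actual))
    return failures


# Hypothetical stand-in for a real aromaticity perception routine:
# it only "knows" benzene and returns anything else unchanged.
KNOWN = {"C1=CC=CC=C1": "c1ccccc1"}


def toy_converter(smiles: str) -> str:
    return KNOWN.get(smiles, smiles)


# Hand-written illustrative pairs, not canonical output from any toolkit.
cases = [
    ("C1=CC=CC=C1", "c1ccccc1"),  # benzene
    ("C1=CC=NC=C1", "c1ccncc1"),  # pyridine, missing from the toy table
]

failures = find_mismatches(cases, toy_converter)
for kekule, expected, actual in failures:
    print(f"{kekule}: expected {expected}, got {actual}")
```

Passing the converter in as a function means the same list of examples
could be run against a toolkit or against a draft of the rules, and each
failure would point at a candidate exception or special case.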

Craig

>
> - Noel
>
> On 17 April 2012 15:11, Geoffrey Hutchison <ge...@geoffhutchison.net> wrote:
>>
>> > Let me start with a little more background on the problem. I am using
>> > Pybel to extract the information I need about a set of ~875 PAH molecules
>> > (including alkyl substituted and radical PAHs).
>> ...
>> > "signature" of an error is typically that a C atom is labelled as sp3
>> > hybridized when it only has three atoms attached. (I have since learned
>> > that I can correct the labeling of one of the molecules by reordering
>> > the C atoms.)
>>
>> Quick question -- can we turn this data set into a unit test to distribute
>> with Open Babel? I wrote up a few fused aromatics into one of the tests, and
>> we've added through bug reports. But this is definitely the most systematic
>> torture test of Kekulization that I've seen.
>>
>> > I have worked quite a bit with two of the molecules, azulene and
>> > 2175908. I have tried to reorder the atoms, convert to 2d, create a mol
>> > file using openbabel, remove hydrogens and then convert to 2d, etc. None
>> > of these things has helped. However, when I create the same molecule in
>> > ChemDraw, openbabel does label the aromaticity correctly.
>>
>> Right. The problem with XYZ format is that Open Babel has to work out all
>> the bond orders from scratch, while in ChemDraw, it just has to detect that
>> it's an aromatic system.
>>
>> As Noel can tell you, we've worked through plenty of rare, subtle Kekule
>> bugs across versions, so this will definitely help us stomp out more of
>> them.
>>
>> If no one else goes for it, I should have some time on Thursday to sift
>> through the code and fix this.
>>
>> Thanks,
>> -Geoff
>
>
>
> ------------------------------------------------------------------------------
> Better than sec? Nothing is better than sec when it comes to
> monitoring Big Data applications. Try Boundary one-second
> resolution app monitoring today. Free.
> http://p.sf.net/sfu/Boundary-dev2dev
> _______________________________________________
> OpenBabel-discuss mailing list
> OpenBabel-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
>
