On 11 October 2010 21:18, Chris Morley <c.mor...@gaseq.co.uk> wrote:
> On 11/10/2010 20:21, Noel O'Boyle wrote:
>>
>> I went through a large dataset of PubChem 3D structures looking for
>> implicit H failures (removing, then adding Hs). 1/3 of the failures
>> are due to the following in atomtyp.txt:
>>
>> INTHYB  [$([#6]([#8D1])[#8D1])]   2       #sp2 carbon
>>
>> This makes any C attached to two Os turn into sp2...even in geminal
>> diols, for example, ClC(Cl)(Cl)C(O)O. According to Wikipedia, these
>> don't tend to last long but they are in PubChem (whether errors or
>> not, I can't say).
>>
>> If it's commented out, then it works fine for these cases.
>>
>> Is there any reason for this rule (it seems to date from the early
>> days)? Perhaps it's to correct ligand structures from the PDB where
>> all examples of this indicate COO-? If so, maybe the PDB cases are
>> better handled in the code using the molecular geometry...?
>
> I have always regarded the implicit valency model as unsatisfactory, and
> maybe it has become a bit messed up over the years. Is it currently used for
> anything other than recognizing when a hydrogen could or should be added to
> an atom?

But that is already a substantial job, isn't it? It determines
whether a SMILES string is read correctly. In general it's
working quite well, although a whole lot of patterns are required
to handle nitrogens.
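
To make the INTHYB problem quoted above concrete, here is a rough
sketch in plain Python. It does not use Open Babel at all: the toy
graph format and the function name matches_sp2_rule are my own
inventions for illustration. It only mimics what the SMARTS
[#6]([#8D1])[#8D1] tests, namely a carbon with two oxygen neighbours
that each have a single explicit connection. Since neither bond order
nor hydrogen count appears in the pattern, a carboxylate and a geminal
diol with its Hs removed look the same to it.

```python
# Toy heavy-atom graphs: each atom is (atomic_number, [neighbour indices]).
# Hydrogens are deliberately left out, mimicking a molecule after H removal.
# This is an illustrative stand-in, not Open Babel's data model.

def matches_sp2_rule(atoms, idx):
    """Mimic [#6]([#8D1])[#8D1]: a carbon with at least two oxygen
    neighbours, each of which has exactly one explicit connection."""
    atomic_number, neighbours = atoms[idx]
    if atomic_number != 6:
        return False
    terminal_oxygens = [
        n for n in neighbours
        if atoms[n][0] == 8 and len(atoms[n][1]) == 1
    ]
    return len(terminal_oxygens) >= 2

# Acetate, CC(=O)[O-]: the case the rule was presumably written for.
acetate = [(6, [1]), (6, [0, 2, 3]), (8, [1]), (8, [1])]

# ClC(Cl)(Cl)C(O)O with hydrogens removed: the geminal diol from the thread.
gem_diol = [
    (17, [1]), (6, [0, 2, 3, 4]), (17, [1]), (17, [1]),
    (6, [1, 5, 6]), (8, [4]), (8, [4]),
]

# Methyl acetate, CC(=O)OC: the ester oxygen has two connections.
methyl_acetate = [(6, [1]), (6, [0, 2, 3]), (8, [1]), (8, [1, 4]), (6, [3])]

print(matches_sp2_rule(acetate, 1))         # carboxylate carbon
print(matches_sp2_rule(gem_diol, 4))        # geminal diol carbon
print(matches_sp2_rule(methyl_acetate, 1))  # ester carbon
```

The first two calls both match and the third does not, which is the
behaviour described in the quoted message: once the Hs are gone, the
pattern has no way to tell C(O)O from COO-.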

> A simpler and more obvious model for this purpose has essentially a single
> IMPVAL for each charge state of the molecule. (Only if you are interested in
> radicals or hydrogen on the higher valency states of second row elements do
> you need another rule for each higher valence.) There is no need for any
> skilful fine tuning. It is more maintainable and will be faster because not
> so many SMARTS patterns need to be matched. Up to now, it has worked for
> everything I've tried (although this is not very extensive), except with
> test_formula, where the fault is in a couple of erroneous results in
> formularesults.txt, which at least shows the old model was error-prone and
> needs some more tweaking of phosphate structures.
>
> There may be other side effects I'm not aware of. Just before a release is
> not a good time to commit something like this (5 years ago would have been
> better), so I've just attached a patch (changes to 11 code lines), if you
> want to try it.

I'll check it out...although I'm cautious also.
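
For anyone following along, the "single valence per charge state"
idea quoted above might be sketched roughly as below. This is not the
attached patch (which isn't reproduced here) and not Open Babel code.
The table DEFAULT_VALENCE, its entries, and the function
implicit_h_count are assumptions of mine, chosen only to show the
shape of the approach: look up one valence for the element and formal
charge, subtract the bond orders already present, and fill the rest
with hydrogens.

```python
# Hypothetical lookup: one default valence per (element, formal charge).
# The entries are ordinary textbook valences, picked for illustration only.
DEFAULT_VALENCE = {
    ("C", 0): 4, ("C", 1): 3, ("C", -1): 3,
    ("N", 0): 3, ("N", 1): 4, ("N", -1): 2,
    ("O", 0): 2, ("O", 1): 3, ("O", -1): 1,
    ("Cl", 0): 1,
}

def implicit_h_count(element, charge, bond_order_sum):
    """Return how many hydrogens to add so the atom reaches its default
    valence. Unknown (element, charge) pairs get no hydrogens."""
    valence = DEFAULT_VALENCE.get((element, charge))
    if valence is None:
        return 0
    return max(0, valence - bond_order_sum)

# Geminal diol carbon in ClC(Cl)(Cl)C(O)O: three single bonds to heavy atoms.
print(implicit_h_count("C", 0, 3))   # 1

# Hydroxyl oxygen in the same molecule: one single bond.
print(implicit_h_count("O", 0, 1))   # 1

# Carboxylate carbon in CC(=O)[O-]: bond orders 1 + 2 + 1.
print(implicit_h_count("C", 0, 4))   # 0

# Negatively charged carboxylate oxygen: one single bond.
print(implicit_h_count("O", -1, 1))  # 0
```

With a scheme like this the geminal diol and the carboxylate come out
differently without any dedicated SMARTS rule, because the bond orders
and charges carry the information. It does rely on those being correct
in the input, which may not hold for every file format.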

> Chris
>
> _______________________________________________
> OpenBabel-Devel mailing list
> OpenBabel-Devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/openbabel-devel
>
>
