On 11 October 2010 22:17, Noel O'Boyle <[email protected]> wrote:
> On 11 October 2010 21:18, Chris Morley <[email protected]> wrote:
>> On 11/10/2010 20:21, Noel O'Boyle wrote:
>>>
>>> I went through a large dataset of PubChem 3D structures looking for
>>> implicit H failures (removing, then adding Hs). 1/3 of the failures
>>> are due to the following in atomtyp.txt:
>>>
>>> INTHYB  [$([#6]([#8D1])[#8D1])]   2       #sp2 carbon
>>>
>>> This makes any C attached to two Os turn into sp2...even in geminal
>>> diols, for example, ClC(Cl)(Cl)C(O)O. According to Wikipedia, these
>>> don't tend to last long but they are in PubChem (whether errors or
>>> not, I can't say).
>>>
>>> If it's commented out, then it works fine for these cases.
>>>
>>> Is there any reason for this rule (it seems to date from the early
>>> days)? Perhaps it's to correct ligand structures from the PDB where
>>> all examples of this indicate COO-? If so, maybe the PDB cases are
>>> better handled in the code using the molecular geometry...?
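
To illustrate the behaviour described above, here is a minimal Python sketch of what the pattern [#6]([#8D1])[#8D1] tests for. It is a toy matcher on a hand-built graph, not Open Babel's SMARTS engine, and it assumes the rule is evaluated after the hydrogens have been removed. The helper names (build, strip_hydrogens, matches_inthyb_rule) are hypothetical. In SMARTS, D1 means one explicit connection and an unspecified bond means single or aromatic, so a hydroxyl oxygen only looks terminal once its H is gone.

```python
# Toy stand-in for the SMARTS pattern [#6]([#8D1])[#8D1].
# Assumption: this is NOT Open Babel code; all names here are illustrative.
# A molecule is a dict: atom index -> (element, {neighbour index: bond order}).

def build(elements, bonds):
    """Build the adjacency dict from element symbols and (a, b, order) bonds."""
    mol = {i: (el, {}) for i, el in enumerate(elements)}
    for a, b, order in bonds:
        mol[a][1][b] = order
        mol[b][1][a] = order
    return mol

def strip_hydrogens(mol):
    """Return a copy of the molecule with all explicit H atoms deleted."""
    keep = {i for i, (el, _) in mol.items() if el != "H"}
    return {
        i: (mol[i][0], {n: o for n, o in mol[i][1].items() if n in keep})
        for i in keep
    }

def matches_inthyb_rule(mol, idx):
    """True if atom idx is a carbon single-bonded to two or more oxygens
    that each have exactly one explicit connection (D1).
    Aromatic bonds are ignored in this toy version."""
    element, bonds = mol[idx]
    if element != "C":
        return False
    terminal_oxygens = [
        nbr for nbr, order in bonds.items()
        if mol[nbr][0] == "O" and order == 1 and len(mol[nbr][1]) == 1
    ]
    return len(terminal_oxygens) >= 2

# ClC(Cl)(Cl)C(O)O with explicit hydrogens; atom 1 is the diol carbon.
diol = build(
    ["C", "C", "O", "O", "H", "H", "H", "Cl", "Cl", "Cl"],
    [(0, 1, 1), (1, 2, 1), (1, 3, 1), (1, 4, 1), (2, 5, 1), (3, 6, 1),
     (0, 7, 1), (0, 8, 1), (0, 9, 1)],
)

# Acetate heavy atoms with an explicit C=O; atom 1 is the carboxylate carbon.
acetate = build(["C", "C", "O", "O"], [(0, 1, 1), (1, 2, 2), (1, 3, 1)])

print(matches_inthyb_rule(diol, 1))
print(matches_inthyb_rule(strip_hydrogens(diol), 1))
print(matches_inthyb_rule(acetate, 1))
```

In this sketch the diol carbon matches only after the hydrogens are stripped, and an acetate with an explicit C=O does not match at all. That is at least consistent with the guess that the rule was aimed at COO groups read without bond orders.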
>>
>> I have always regarded the implicit valency model as unsatisfactory, and
>> maybe it has become a bit messed up over the years. Is it currently used for
>> anything other than recognizing when a hydrogen could or should be added to
>> an atom?
>
> But this is already a substantial part, isn't it, as it determines
> whether you can read a SMILES string correctly. In general it's
> working quite well - there's a whole lot of patterns required to
> handle nitrogens though.
>
>> A simpler and more obvious model for this purpose has essentially a single
>> IMPVAL for each charge state of the molecule. (Only if you are interested in
>> radicals or hydrogen on the higher valency states of second row elements do
>> you need another rule for each higher valence.) There is no need for any
>> skilful fine tuning. It is more maintainable and will be faster because not
>> so many SMARTS patterns need to be matched. Up to now, it has worked for
>> everything I've tried (although this is not very extensive), except with
>> test_formula, where the fault is in a couple of erroneous results in
>> formularesults.txt, which at least shows the old model was error-prone and
>> needs some more tweaking of phosphate structures.
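
The simpler model described above can be sketched roughly as follows. This is a guess at the idea only, not the attached patch: the table TYPICAL_VALENCE, its values, and the function implicit_hydrogens are hypothetical and cover just a few elements. Each (element, formal charge) pair gets one valence, and the implicit hydrogen count is whatever remains after the explicit bond orders are subtracted.

```python
# Hypothetical typical-valence table keyed by (element, formal charge).
# Assumption: the values are illustrative and are not taken from the patch
# or from Open Babel's data files.
TYPICAL_VALENCE = {
    ("C", 0): 4,
    ("N", 0): 3,
    ("N", 1): 4,
    ("O", 0): 2,
    ("O", -1): 1,
    ("O", 1): 3,
    ("Cl", 0): 1,
}

def implicit_hydrogens(element, charge, bond_order_sum):
    """Hydrogens needed to fill the single valence assigned to this
    element and charge, given the sum of its explicit bond orders."""
    valence = TYPICAL_VALENCE[(element, charge)]
    return max(0, valence - bond_order_sum)

# Diol carbon of ClC(Cl)(Cl)C(O)O: single bonds to one C and two O.
print(implicit_hydrogens("C", 0, 3))
# Hydroxyl oxygen: one single bond to carbon.
print(implicit_hydrogens("O", 0, 1))
# Carboxylate oxygen carrying the negative charge.
print(implicit_hydrogens("O", -1, 1))
```

Because the lookup depends only on the element, the charge and the bonds already present, the geminal diol carbon gets one hydrogen here without any environment-specific SMARTS rule.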
>>
>> There may be other side effects I'm not aware of. Just before a release is
>> not a good time to commit something like this (5 years ago would have been
>> better), so I've just attached a patch (changes to 11 code lines), if you
>> want to try it.
>
> I'll check it out...although I'm cautious also.

Wow - if that works, it looks much better. Where have you been hiding
this code? :-)

But...I don't think we have time to figure out if it causes some
problems elsewhere. And this release, perhaps even more than others,
needs to avoid introducing bugs as much as possible. Could you wait
for MolCore?

>> Chris
>>
>> ------------------------------------------------------------------------------
>> Beautiful is writing same markup. Internet Explorer 9 supports
>> standards for HTML5, CSS3, SVG 1.1,  ECMAScript5, and DOM L2 & L3.
>> Spend less time writing and  rewriting code and more time creating great
>> experiences on the web. Be a part of the beta today.
>> http://p.sf.net/sfu/beautyoftheweb
>> _______________________________________________
>> OpenBabel-Devel mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/openbabel-devel
>>
>>
>
