On Nov 3, 2009, at 11:06 AM, Vincent Le Guilloux wrote:

> When I load the same molecule but with all bond types defined at 4
> (aromatic), the CDK will directly set the aromatic flag to 1, but will
> define all bonds as SINGLE bonds.
>
> Now if I pass a benzene molecule with bond types defined at 4 through
> the CDK, and if I regenerate the SDF file without hydrogen, I will
> get... cyclohexane. If I add explicit hydrogen, I will get something
> that could be interpreted as benzene missing all double bonds, or
> cyclohexane missing one hydrogen for each carbon atom. Below is
> a small example snippet.
>
> Maybe I'm doing something wrong, in which case I'm talking for
> nothing. If not, as such bond definition is not that uncommon (at
> least to my modest knowledge), I think this is an important issue.

Indeed, this is an important issue, and a number of bugs in different
subsystems occur because of it. Even two molecules read from SMILES for
the same structure, one in aromatic form and one in Kekulé form, will
differ, even after aromaticity detection on both of them (because the
single/double bond assignments are not fixed).

The problem is that any method that looks at (or needs to look at) bond
order will be confused by this. I suppose such methods should also be
updated to check aromaticity, where it's relevant.
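
To make the point concrete, here is a minimal sketch of the difference
between comparing bonds by order alone and comparing them with the
aromatic flag taken into account. It deliberately does not use the CDK
API: the Bond class, the two benzene builders and the comparison methods
are all hypothetical stand-ins invented for illustration, and bonds are
simply compared position by position rather than through a real
structure match.

```java
import java.util.ArrayList;
import java.util.List;

public class AromaticBondCheck {

    /** Hypothetical stand-in for a bond: a stored order plus an aromatic flag. */
    public static final class Bond {
        final int order;        // 1 = single, 2 = double
        final boolean aromatic; // set from aromatic input or by aromaticity detection

        Bond(int order, boolean aromatic) {
            this.order = order;
            this.aromatic = aromatic;
        }
    }

    /** Naive comparison: looks only at the stored bond order. */
    public static boolean sameByOrder(Bond a, Bond b) {
        return a.order == b.order;
    }

    /** Aromaticity-aware comparison: two aromatic bonds match whatever order is stored. */
    public static boolean sameBondType(Bond a, Bond b) {
        if (a.aromatic || b.aromatic) {
            return a.aromatic && b.aromatic;
        }
        return a.order == b.order;
    }

    /** Six ring bonds with alternating double/single orders, all flagged aromatic. */
    public static List<Bond> kekuleBenzene() {
        List<Bond> ring = new ArrayList<>();
        for (int i = 0; i < 6; i++) {
            ring.add(new Bond(i % 2 == 0 ? 2 : 1, true));
        }
        return ring;
    }

    /** Six ring bonds as read from aromatic input: all stored as single, all flagged aromatic. */
    public static List<Bond> aromaticInputBenzene() {
        List<Bond> ring = new ArrayList<>();
        for (int i = 0; i < 6; i++) {
            ring.add(new Bond(1, true));
        }
        return ring;
    }

    /** Compares two rings bond by bond, with or without consulting the aromatic flag. */
    public static boolean ringsMatch(List<Bond> x, List<Bond> y, boolean useAromaticity) {
        if (x.size() != y.size()) {
            return false;
        }
        for (int i = 0; i < x.size(); i++) {
            boolean same = useAromaticity
                    ? sameBondType(x.get(i), y.get(i))
                    : sameByOrder(x.get(i), y.get(i));
            if (!same) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Bond> kekule = kekuleBenzene();
        List<Bond> aromatic = aromaticInputBenzene();
        System.out.println("order only:     " + ringsMatch(kekule, aromatic, false));
        System.out.println("aromatic-aware: " + ringsMatch(kekule, aromatic, true));
    }
}
```

With order-only comparison the two rings are treated as different, while
the flag-aware comparison treats them as the same ring. Whether that is
the right behaviour for a given CDK method depends on what the method is
computing.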

> Just one question beside this discussion: why not put the aromatic
> bond flag in the output in such cases?

After doing aromaticity detection, aromatic bonds are marked as such.
I'm not sure I understand what you mean here.

>
> I also would have one suggestion on how to improve the CDK: defining a
> clear (customizable), documented standardization protocol which would
> include issues like aromatization/dearomatization, ionization,
> tautomerization, 2D/3D cleaning... The best example to my knowledge is
> ChemAxon's Standardizer:
> http://chemaxon.com/jchem/doc/user/Standardizer.html


Indeed. This is useful and necessary. There is some code here at the
NCGC that can be used for this. It is based on ChemAxon code, but
should be convertible to the CDK (though it will require some additional
CDK methods to be implemented).

----------------------------------------------------
Rajarshi Guha        | NIH Chemical Genomics Center
http://www.rguha.net | http://ncgc.nih.gov
----------------------------------------------------
Every nonzero finite dimensional inner product
space has an orthonormal basis.
It makes sense, when you don't think about it.



_______________________________________________
Cdk-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/cdk-user
