I figured that since I haven't seen XYZ being used for a couple of decades, I 
didn't need to worry about it that much. ;)

PDB input is more useful, but that's a specialized topic in its own right, one 
that needs careful attention to PDB naming conventions, bond angles, ring 
flatness, etc.

I also had problems getting the mol2 reader to work. It doesn't look like it's 
had any TLC for over a decade, again, probably because no one uses it.

So I'm limiting the CDK formats supported in chemfp to SMILES, SDF, and InChI. 
And if no one asks for other formats then I don't need to do anything.


                                Andrew
                                da...@dalkescientific.com


 

> On Jan 20, 2021, at 08:29, Egon Willighagen <egon.willigha...@gmail.com> 
> wrote:
> 
> 
> Yes, I understand that. I think the original workflow was first to establish 
> the bonds ("single" by default) and then to use valency information, hoping 
> that the structure makes sense (no hydrogens missing), etc. The book has a 
> follow-up section on that second step. But I share your doubt about how 
> well it works. I do not remember if I did a validation like we would do 
> nowadays. It would be good to do something like this:
> 
> 1. take 1000 random 3D structures from PubChem (add zeros according to taste)
> 2. remove all bond info, and keep only the 3D locations
> 3. rebond, add missing bond orders
> 4. compare.
> 
> Now, this set would not really be a real-world scenario. I could imagine one 
> would like to repeat this for structures from COD too. Actually, maybe check 
> out this paper: 
> https://jcheminf.biomedcentral.com/articles/10.1186/s13321-018-0279-6
> 
> Egon
> 
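
Steps 2 to 4 of the procedure quoted above could be sketched roughly as
follows. This is only an illustration under stated assumptions, not what CDK
does: bonds are perceived with a plain covalent-radius distance cutoff, the
radii table and the 0.4 angstrom tolerance are illustrative values, the water
record is a hand-written stand-in for a PubChem 3D structure, and bond orders
(the second half of step 3) are not handled at all. All function and variable
names are hypothetical.

```python
from itertools import combinations
from math import dist

# Approximate single-bond covalent radii in angstroms (illustrative values).
COVALENT_RADIUS = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}

# Assumed slack added to the sum of radii; a tunable guess, not a CDK setting.
TOLERANCE = 0.4


def rebond(atoms):
    """Infer bonded index pairs from element symbols and 3D positions only."""
    bonds = set()
    for (i, (sym_i, xyz_i)), (j, (sym_j, xyz_j)) in combinations(enumerate(atoms), 2):
        cutoff = COVALENT_RADIUS[sym_i] + COVALENT_RADIUS[sym_j] + TOLERANCE
        if dist(xyz_i, xyz_j) <= cutoff:
            bonds.add((i, j))  # i < j is guaranteed by combinations()
    return bonds


def compare(reference, perceived):
    """Count bonds that were recovered, missed, or wrongly added."""
    return {
        "matched": len(reference & perceived),
        "missed": len(reference - perceived),
        "extra": len(perceived - reference),
    }


# Hypothetical test record: water, with the bond list kept separately so the
# rebonding step sees only the coordinates (step 2 of the procedure).
WATER_ATOMS = [
    ("O", (0.000, 0.000, 0.000)),
    ("H", (0.757, 0.586, 0.000)),
    ("H", (-0.757, 0.586, 0.000)),
]
WATER_BONDS = {(0, 1), (0, 2)}

if __name__ == "__main__":
    perceived = rebond(WATER_ATOMS)            # step 3, connectivity only
    print(compare(WATER_BONDS, perceived))     # step 4
```

A real validation would loop this over the full structure set and would also
compare bond orders, which is the part in doubt here.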




_______________________________________________
Cdk-user mailing list
Cdk-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/cdk-user
