I figured that since I haven't seen XYZ being used for a couple of decades, I didn't need to worry about it that much. ;)
PDB input is more useful, but that's a specialized topic in its own right, requiring careful attention to PDB naming conventions, bond angles, ring flatness, etc.

I also had problems getting the mol2 reader to work. It doesn't look like it's had any TLC for over a decade, again probably because no one uses it.

So I'm limiting the supported CDK formats in chemfp to SMILES, SDF, and InChI. And if no one asks for other formats then I don't need to do anything.

Andrew
da...@dalkescientific.com

> On Jan 20, 2021, at 08:29, Egon Willighagen <egon.willigha...@gmail.com> wrote:
>
> Yes, I understand that. I think the original workflow was first to establish
> the bonds ("single" by default) and then to use valency information, hoping
> that the structure makes sense (no hydrogens missing), etc. The book has a
> follow-up section on that second step. But I agree with your doubt about how
> well it works. I do not remember if I did a validation like we would do
> nowadays. It would be good to do something like this:
>
> 1. take 1000 random 3D structures from PubChem (add zeros according to taste)
> 2. remove all bond info, and keep only the 3D locations
> 3. rebond, add missing bond orders
> 4. compare.
>
> Now, this set would not really be a real-world scenario. I could imagine one
> would like to repeat this for structures from COD too. Actually, maybe check
> out this paper:
> https://jcheminf.biomedcentral.com/articles/10.1186/s13321-018-0279-6
>
> Egon
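
For what it's worth, steps 2 to 4 of the validation outlined in the quoted message could be sketched roughly as below. This is a minimal pure-Python illustration, not the CDK's rebonding code: the small radii table, the 0.4 Å tolerance, and the function names `rebond` and `compare` are all assumptions made for the example. It only perceives connectivity (every bond treated as "single") and makes no attempt at the harder second step of assigning bond orders.

```python
import math
from itertools import combinations

# Approximate single-bond covalent radii in angstroms (illustrative subset).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}


def rebond(atoms, tolerance=0.4):
    """Perceive bonds from 3D positions alone.

    atoms is a list of (element, (x, y, z)) tuples. Two atoms are treated as
    bonded when their distance is at most the sum of their covalent radii
    plus a tolerance (hypothetical default, chosen for this sketch).
    Returns a set of (i, j) index pairs with i < j.
    """
    bonds = set()
    for (i, (el_i, xyz_i)), (j, (el_j, xyz_j)) in combinations(enumerate(atoms), 2):
        cutoff = COVALENT_RADII[el_i] + COVALENT_RADII[el_j] + tolerance
        if math.dist(xyz_i, xyz_j) <= cutoff:
            bonds.add((i, j))
    return bonds


def compare(reference, perceived):
    """Report bonds lost and bonds invented by the rebonding step."""
    return {"missing": reference - perceived, "extra": perceived - reference}


# Step 2: a structure with its bond info removed (water, coordinates in angstroms).
water = [
    ("O", (0.0, 0.0, 0.0)),
    ("H", (0.9572, 0.0, 0.0)),
    ("H", (-0.24, 0.9266, 0.0)),
]
reference_bonds = {(0, 1), (0, 2)}  # the bonds that were stripped

# Steps 3 and 4: rebond, then compare against the original connectivity.
report = compare(reference_bonds, rebond(water))
```

For a real test one would loop this over the PubChem or COD structures and tally how many come back with an empty `missing` and `extra`, then repeat the comparison after bond-order assignment.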