Hi Manuel,

Chris is right: unfortunately the ChemDraw export isn't quite correct. It is actually possible to represent multi-attach in V3000, but it's not used here. The more common problem is that there is simply a random bond into the middle of a ring.

I've done a fair bit of work on ChemDraw processing (https://nextmovesoftware.com/blog/2016/07/28/sketchy-sketches/). The biggest issue is ChemDraw's chemical formula/abbreviation parsing: for example, K2CO3 ends up with a peroxide, HATU becomes "[H]*[3H][U]", etc. (I show more examples in the poster).

NextMove has a commercial tool to generate CXSMILES. For your example, note the *m:* part at the end, which captures the positional variation.

> [john@harbinger:Praline]% java -jar exec/target/praline.jar convert
> ~/Downloads/structure.cdx --cxsmi
> [Ru]([P](CCC1=CC=CC=C1)(C2CCCCC2)C3CCCCC3)(Cl)(Cl)*.C1(=CC=C(C=C1)C(C)C)C
> |m:24:25.26.27.28.29.30| structure Molecule/Specific/High/+PVar

CDK can read and handle this, although we do still get the formula wrong (will fix that).

OpenBabel has a FOSS ChemDraw parser. One option could be to modify that, parse your examples to get the info, and then generate the MOLfile/CXSMILES. The parsing is easy: *NodeType="MultipleAttach" Attachments="{id1} {id2} .."*, where the IDs are node IDs. Unfortunately I don't think they have the data structures to represent it, so it would be a fair bit of work beyond handling these fields.

All the best,
John

On Wed, 2 Dec 2020 at 15:05, Christoph Steinbeck <christoph.steinb...@uni-jena.de> wrote:

> Dear Manuel,
>
> if you open the mol file in a text editor, there are clearly 31 C atoms in
> the file.
> So the CDK is “right”. I also opened the file in Marvin Sketch and it
> output the analysis below.
>
> ChemDraw uses a fishy trick, as it seems, to create the illusion of a
> multi-center attachment. Clearly, they focus on publication-ready drawing
> of chemical structures and not on creating correct file representations of
> the chemistry. The fact is that the end of the line to the center of the
> benzene ring is a carbon atom and nothing else.
>
> Kind regards,
>
> Chris
>
> --
> Prof. Dr. Christoph Steinbeck
> Analytical Chemistry - Cheminformatics and Chemometrics
> Friedrich-Schiller-University Jena, Germany
> Phone Secretariat: +49-3641-948171
> http://cheminf.uni-jena.de
> http://orcid.org/0000-0001-6966-0814
>
> What is man but that lofty spirit - that sense of enterprise.
> ... Kirk, "I, Mudd," stardate 4513.3..
>
> On 2. Dec 2020, at 14:38, Stesycki, Manuel <stesy...@mpi-muelheim.mpg.de>
> wrote:
> >
> > Dear CDK users,
> >
> > we are using CDK version 2.3 in our application.
> > When a user tried to add a structure (see attachment), we found a
> > difference in the molecular formula of the structure.
> >
> > The original structure was drawn with ChemDraw 18.
> > A multi-center attachment was added to the structure and ChemDraw shows
> > this molecular formula: C30H46Cl2PRu
> >
> > Whereas our application takes the mol-version of the cdx-file and
> > computes this formula: C31H49Cl2PRu
> > To get the formula we use this piece of code:
> >
> > IMolecularFormula form =
> >     MolecularFormulaManipulator.getMolecularFormula(mol);
> > sumFormula = MolecularFormulaManipulator.getString(form);
> >
> > Did we miss something when creating the AtomContainer?
> > We create the AtomContainer directly by parsing the mol-file:
> > try (StringReader sr = new StringReader(molFile); MDLV2000Reader mr =
> >     new MDLV2000Reader(sr, mode)) {
> >
> >     AtomContainer mol = new AtomContainer();
> >     AtomContainer ac = mr.read(mol);
> > }
> >
> > Maybe someone can give us a hint as to what we are doing wrong.
> >
> > Best regards,
> > Manuel Stesycki
> >
> > IT
> > 0208 / 306-2146
> > Physikbau, Büro 117
> > stesy...@mpi-muelheim.mpg.de
> >
> > Max-Planck-Institut für Kohlenforschung
> > Kaiser-Wilhelm-Platz 1
> > D-45470 Mülheim an der Ruhr
> > http://www.kofo.mpg.de/de
> >
> > _______________________________________________
> > Cdk-user mailing list
> > Cdk-user@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/cdk-user
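
For reference, the *m:* field in the CXSMILES output quoted earlier in this message can be unpacked with a few lines of plain Java. This is only a minimal sketch, not how CDK or the NextMove tool does it: the class name PositionalVariation and its parse method are hypothetical, it assumes zero-based atom indices and comma-separated fields inside the |...| block, and it ignores every field other than m:.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of CDK or any NextMove tool.
class PositionalVariation {

    /**
     * Extracts the "m:" (positional variation) fields from a CXSMILES
     * extension block such as "|m:24:25.26.27.28.29.30|".
     * Returns a map: index of the attachment atom -> indices of the
     * candidate atoms it may bond to. All other fields are skipped.
     */
    static Map<Integer, List<Integer>> parse(String extension) {
        String body = extension.trim();
        // Strip the enclosing '|' delimiters if present.
        if (body.startsWith("|")) body = body.substring(1);
        if (body.endsWith("|")) body = body.substring(0, body.length() - 1);

        Map<Integer, List<Integer>> result = new LinkedHashMap<>();
        for (String field : body.split(",")) {
            String[] parts = field.trim().split(":");
            // An m: field has exactly three parts: "m", atom, dot-separated list.
            if (parts.length != 3 || !parts[0].equals("m")) continue;
            List<Integer> candidates = new ArrayList<>();
            for (String idx : parts[2].split("\\.")) {
                candidates.add(Integer.parseInt(idx));
            }
            result.put(Integer.parseInt(parts[1]), candidates);
        }
        return result;
    }
}
```

For the output above, PositionalVariation.parse("|m:24:25.26.27.28.29.30|") maps atom 24 (the `*`) to atoms 25 to 30, i.e. the six aromatic ring atoms of the second component.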