Hi David,

thank you for your reply. I tested your code that assigns sp2 hybridisation to an N such as the one in azete (in mol.cpp), and it fixes the incorrect bond orders.
However, I think that this is a general problem: I have numerous compounds that get incorrect bond orders and end up with a different charge than the one specified in the Gaussian output file (e.g. sulfite and sulfur trioxide). I think it might be a good idea to have a general "charge check" somewhere. (Maybe there already is one and I am failing to use it? I checked whether OB reads the specified charge, and it seems to.) For example, it would be great to have a function that checks (i) the charge of the compound (or whether it is a radical) and (ii) the sum of bond orders on a given atom (I guess that is what GetExplicitValence() returns). If a compound is neutral (and not a radical) and ends up with e.g. a C that has only three single bonds, this should somehow be corrected. Is there any way to do this with the existing OB functions, i.e. without changing the code?

With kind regards,
Madeleine

On 5 Jun 2021, at 01:42, David Koes <dk...@pitt.edu> wrote:

Hi Madeleine,

It might be helpful to start a GitHub Issue where you upload example files.

I believe the main issue here is that Gaussian files do not contain bond information, so OpenBabel has to infer it with ConnectTheDots (identifying bonded atoms) and PerceiveBondOrders. These functions rely on a lot of ad hoc heuristics to try to figure out the right answer from perhaps inadequate information.

The failure with azete is in PerceiveBondOrders, where a C=C double bond is preferred over the C=N bond because no hybridization was set on the N. Specifically, line 3494 in mol.cpp:

    if ( (b->GetHyb() == 2 || b->GetExplicitDegree() == 1)
         && b->GetExplicitValence() + 1 <= static_cast<unsigned int>(OBElements::GetMaxBonds(b->GetAtomicNum()))
         && (GetBond(atom, b))->IsDoubleBondGeometry()
         && (currentElNeg > maxElNeg || (IsApprox(currentElNeg,maxElNeg, 1.0e-6)) ) )

b is the N, which is a neighbor atom of the C that is under consideration.
The b->GetHyb() == 2 check fails, so the C=C is formed instead of C=N to get the valence to add up. The hybridization of the N isn't set because the angle is too small for the hybridization estimation code (lines 3204-3224) to set it:

    if (angle > 155.0)
      atom->SetHyb(1);
    else if (angle <= 155.0 && angle > 115.0)
      atom->SetHyb(2);

    // special case for imines
    if (atom->GetAtomicNum() == OBElements::Nitrogen
        && atom->ExplicitHydrogenCount() == 1
        && atom->GetExplicitDegree() == 2
        && angle > 109.5)
      atom->SetHyb(2);

We could extend this with:

    else if(atom->GetAtomicNum() == OBElements::Nitrogen
            && atom->GetExplicitDegree() == 2
            && atom->IsInRing()) //azete
      atom->SetHyb(2);

and this particular problem is fixed. It isn't clear to me whether a less stringent criterion could be applied instead.

David Koes
Associate Professor
Computational & Systems Biology
University of Pittsburgh

On 6/3/21 11:13 AM, Marie-Madeleine Walz wrote:

Hello,

I would like to share another example illustrating that OB does not seem to take charges into account when assigning bonds. Processing azete (neutral) from a Gaussian output file with OB, I obtain the following charged azete analog. Is there any part of the code that should take care of charges when assigning bonds?

With kind regards,
Madeleine

On 28 May 2021, at 11:41, Marie-Madeleine Walz <marie-madeleine.w...@icm.uu.se> wrote:

Hello,

I’m working on processing Gaussian output files (g09) with OB to set bond orders. I noticed that the compound's charge information does not seem to be taken into account when bond orders are set. For example, for neutral compounds I obtain structures that are charged, e.g. carbon atoms with only three single bonds, or nitrogens with only two bonds.

One of the most obvious examples is sulfite (SO_3^(2-)) vs. sulfur trioxide (SO_3). Both have their bond orders coded in bondtyp.txt.

    # Sulfite
    [#16D3]([#8D1])([#8D1-])([#8D1-]) 0 1 2 0 2 1 0 3 1
    # Sulfur trioxide
    [#16D3]([#8D1])([#8D1])([#8D1]) 0 1 2 0 2 2 0 3 2

However, OB does not recognise sulfite as charged; instead it assigns three double bonds (i.e. it uses the sulfur trioxide definition). If I remove the charge information from the sulfite SMARTS pattern, it assigns the correct bond orders. However, sulfur trioxide is then obviously assigned these bond orders too, as the two SMARTS patterns are now identical.

    # Sulfite
    [#16D3]([#8D1])([#8D1])([#8D1]) 0 1 2 0 2 1 0 3 1

Any ideas on how to solve this issue are highly appreciated.

With kind regards,
Madeleine

E-mailing Uppsala University means that we will process your personal data. For more information on how this is performed, please read here: http://www.uu.se/en/about-uu/data-protection-policy

_______________________________________________
OpenBabel-Devel mailing list
OpenBabel-Devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-devel
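For reference, the "charge check" requested in the most recent message of this thread could look roughly like the following. This is only a minimal, self-contained sketch and not existing Open Babel code: AtomInfo, usualNeutralValence, findSuspiciousAtoms, sumFormalCharges and isChargeConsistent are hypothetical names, and the small valence table is an assumption made for illustration. In practice the explicitValence field would presumably be filled from GetExplicitValence(), as suggested in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-in for the per-atom data one would read from the
// toolkit (atomic number, sum of bond orders, assigned formal charge).
struct AtomInfo {
    int atomicNum;        // e.g. 6 for C, 7 for N, 8 for O
    int explicitValence;  // sum of bond orders to this atom
    int formalCharge;     // formal charge currently assigned to this atom
};

// Assumed usual valence of a neutral atom, for illustration only.
// Elements with several common valences (e.g. sulfur) are left out
// on purpose and are reported as -1 ("not covered").
int usualNeutralValence(int atomicNum) {
    static const std::map<int, int> table = {{1, 1}, {6, 4}, {7, 3}, {8, 2}};
    const auto it = table.find(atomicNum);
    return it == table.end() ? -1 : it->second;
}

// Indices of atoms that have fewer bonds than usual although no formal
// charge has been assigned to explain the missing bond(s).
std::vector<std::size_t> findSuspiciousAtoms(const std::vector<AtomInfo>& atoms) {
    std::vector<std::size_t> suspicious;
    for (std::size_t i = 0; i < atoms.size(); ++i) {
        const AtomInfo& a = atoms[i];
        assert(a.explicitValence >= 0);
        const int usual = usualNeutralValence(a.atomicNum);
        if (usual < 0) continue;  // element not covered by the table
        if (a.explicitValence < usual && a.formalCharge == 0)
            suspicious.push_back(i);
    }
    return suspicious;
}

// Sum of the formal charges assigned to the individual atoms.
int sumFormalCharges(const std::vector<AtomInfo>& atoms) {
    int total = 0;
    for (const AtomInfo& a : atoms) total += a.formalCharge;
    return total;
}

// For a non-radical molecule: the atomic formal charges must add up to
// the charge declared in the input file, and no atom may be flagged.
bool isChargeConsistent(const std::vector<AtomInfo>& atoms, int declaredCharge) {
    return sumFormalCharges(atoms) == declaredCharge
        && findSuspiciousAtoms(atoms).empty();
}
```

With this sketch, a sulfite input (declared charge -2) that was given three S=O double bonds and no atomic charges fails the check, while the assignment with one S=O and two S-O(-) passes. A neutral molecule containing a C with only three single bonds is flagged at that carbon. Radicals are not handled and would need the spin multiplicity as an additional input.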