[Rdkit-discuss] Two nitrogens in a 5 membered ring
Hi,

If I have a five-membered ring with two consecutive Ns and alternating single and double bonds, expressed by the SMILES

    N1N=CC=C1

RDKit gives me a molecule in which every atom is aromatic. If I give it

    N1=NC=CC1

it gives me a molecule in which every atom is aliphatic. If I give it

    n1nccc1

it gives me a kekulization error.

I, possibly naively, thought the forms would be all aromatic or all aliphatic. Am I missing something or is this a bug?

    Chem.MolFromSmiles('N1N=CC=C1').Debug()
    Atoms:
        0 7 N chg: 0 deg: 2 exp: 3 imp: 0 hyb: 3 arom?: 1 chi: 0
        1 7 N chg: 0 deg: 2 exp: 3 imp: 0 hyb: 3 arom?: 1 chi: 0
        2 6 C chg: 0 deg: 2 exp: 3 imp: 1 hyb: 3 arom?: 1 chi: 0
        3 6 C chg: 0 deg: 2 exp: 3 imp: 1 hyb: 3 arom?: 1 chi: 0
        4 6 C chg: 0 deg: 2 exp: 3 imp: 1 hyb: 3 arom?: 1 chi: 0
    Bonds:
        0 0-1 order: 12 conj?: 1 aromatic?: 1
        1 1-2 order: 12 conj?: 1 aromatic?: 1
        2 2-3 order: 12 conj?: 1 aromatic?: 1
        3 3-4 order: 12 conj?: 1 aromatic?: 1
        4 4-0 order: 12 conj?: 1 aromatic?: 1

    Chem.MolFromSmiles('N1=NC=CC1').Debug()
    Atoms:
        0 7 N chg: 0 deg: 2 exp: 3 imp: 0 hyb: 3 arom?: 0 chi: 0
        1 7 N chg: 0 deg: 2 exp: 3 imp: 0 hyb: 3 arom?: 0 chi: 0
        2 6 C chg: 0 deg: 2 exp: 3 imp: 1 hyb: 3 arom?: 0 chi: 0
        3 6 C chg: 0 deg: 2 exp: 3 imp: 1 hyb: 3 arom?: 0 chi: 0
        4 6 C chg: 0 deg: 2 exp: 2 imp: 2 hyb: 4 arom?: 0 chi: 0
    Bonds:
        0 0-1 order: 2 conj?: 1 aromatic?: 0
        1 1-2 order: 1 conj?: 1 aromatic?: 0
        2 2-3 order: 2 conj?: 1 aromatic?: 0
        3 3-4 order: 1 conj?: 0 aromatic?: 0
        4 4-0 order: 1 conj?: 0 aromatic?: 0

    Chem.MolFromSmiles('n1nccc1').Debug()
    [15:31:44] Can't kekulize mol

Yours,
Toby Wright
--
InhibOx Ltd
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] Two nitrogens in a 5 membered ring
On Mon, 3 Mar 2014 15:41:43 + Toby Wright toby.wri...@inhibox.com wrote:

> If I have a five membered ring with 2 consecutive Ns and alternating
> single and double bonds expressed by the smiles: N1N=CC=C1 RDKit gives
> me a molecule in which every atom is aromatic. If I give it: N1=NC=CC1
> it gives me a molecule in which every atom is aliphatic. If I give it:
> n1nccc1 it gives me a kekulization error. I, possibly naively, thought
> the forms would be all aromatic or all aliphatic. Am I missing
> something or is this a bug?

I would say the behavior you describe is due to Daylight's specification of aromaticity detection in SMILES, which I assume RDKit follows. For more details, see section 3.4.2, "Aromaticity", of the document below:

http://www.daylight.com/dayhtml/doc/theory/theory.smiles.html

Following it, I think that for RDKit to recognize your aromatic heterocycle properly, its SMILES should read, e.g., [nH]1nccc1.

Hope it helps.

--
Mikołaj Kowalik
Re: [Rdkit-discuss] Two nitrogens in a 5 membered ring
Hi Toby,

I'd say it's more of a limitation inherent in Kekule representations than an actual bug in RDKit. Trying to get too clever in figuring out what the user meant usually causes more harm than good.

I'm not sure what version of RDKit you're using, but the aromatic specification with an explicit hydrogen on one of the nitrogen atoms works for me:

    Chem.MolFromSmiles('n1[nH]ccc1').Debug();
    Atoms:
        0 7 N chg: 0 deg: 2 exp: 3 imp: 0 hyb: 3 arom?: 1 chi: 0
        1 7 N chg: 0 deg: 2 exp: 3 imp: 0 hyb: 3 arom?: 1 chi: 0
        2 6 C chg: 0 deg: 2 exp: 3 imp: 1 hyb: 3 arom?: 1 chi: 0
        3 6 C chg: 0 deg: 2 exp: 3 imp: 1 hyb: 3 arom?: 1 chi: 0
        4 6 C chg: 0 deg: 2 exp: 3 imp: 1 hyb: 3 arom?: 1 chi: 0
    Bonds:
        0 0-1 order: 12 conj?: 1 aromatic?: 1
        1 1-2 order: 12 conj?: 1 aromatic?: 1
        2 2-3 order: 12 conj?: 1 aromatic?: 1
        3 3-4 order: 12 conj?: 1 aromatic?: 1
        4 4-0 order: 12 conj?: 1 aromatic?: 1

The double bonds in the Kekule representations here can be between atom pairs 1,2 and 3,4 or between atom pairs 2,3 and 4,0. Putting one between pair 0,1 leaves atom 4 with two single bonds to it (and therefore, to satisfy valence requirements, two implicit hydrogens); I'm not horribly surprised that RDKit perceives that as aliphatic. You can see that's what's happening in your second example where the hybridization of atom 4 is 4 (sp3) instead of 3 (sp2).

Regards,
Bob

--
Bob Funchess, Ph.D.
Senior Scientist
Kelaroo, Inc
www.kelaroo.com
bfunch...@kelaroo.com
(858) 259-7561 x3
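Bob's argument about where the double bonds can go can be checked by brute force. The sketch below is a toy enumeration in plain Python, not RDKit's kekulization code; the names RING, BONDS and kekule_placements are made up for illustration. It lists every way to place two non-adjacent double bonds in the five-membered ring and reports which atom is left with only single bonds.

```python
from itertools import combinations

# Ring atoms in SMILES order for the skeleton discussed above; atoms 0 and 1 are N.
RING = ["N", "N", "C", "C", "C"]
# Ring bonds as (begin, end) atom-index pairs, numbered as in the Debug() output.
BONDS = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]


def kekule_placements(ring=RING, bonds=BONDS):
    """Yield (double_bonds, leftover_atom) for every way of placing two
    double bonds in the ring so that no atom carries two of them."""
    for pair in combinations(bonds, 2):
        touched = [atom for bond in pair for atom in bond]
        if len(set(touched)) < 4:  # the two bonds share an atom: skip
            continue
        leftover = (set(range(len(ring))) - set(touched)).pop()
        yield pair, leftover


for double_bonds, leftover in kekule_placements():
    element = RING[leftover]
    # An N left with only single bonds can carry the H and donate a lone pair;
    # a C left with only single bonds has to become a CH2.
    if element == "N":
        verdict = "N-H, ring can be aromatic"
    else:
        verdict = "CH2 (sp3), ring is aliphatic"
    print(double_bonds, "-> atom", leftover, element, ":", verdict)
```

Only the two placements Bob names (1,2 with 3,4, and 2,3 with 4,0) leave a nitrogen as the single-bonded atom; the other three, including the one written as N1=NC=CC1, leave a carbon.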
Re: [Rdkit-discuss] Building RDKit on Windows
Hi All,

I have just rebuilt RDKit on Windows using the latest source, and am seeing a problem with smaTest1 failing (as well as still seeing the same DbCLI failure posted previously...)

The smaTest1 failure seems a little strange because it actually throws a Windows executable error (smaTest1.exe has stopped working, etc). If I run ctest -V -R smaTest1 I see the output below.

Any thoughts?

Kind regards
James

    C:\RDKit\build>ctest -V -R smaTest1
    UpdateCTestConfiguration from :C:/RDKit/build/DartConfiguration.tcl
    UpdateCTestConfiguration from :C:/RDKit/build/DartConfiguration.tcl
    Test project C:/RDKit/build
    Constructing a list of tests
    Done constructing a list of tests
    Checking test dependency graph...
    Checking test dependency graph end
    test 32
        Start 32: smaTest1
    32: Test command: C:\RDKit\build\Code\GraphMol\SmilesParse\Release\smaTest1.exe
    32: Test timeout computed to be: 9.99988e+006
    32: [17:42:57] -
    32: [17:42:57] Testing patterns which should parse.
    32: [17:42:57] SMARTS Parse Error: syntax error for input: c1b1
    32: [17:42:57]
    32:
    32:
    32: Invariant Violation
    32: c1b1
    32: Violation occurred on line 90 in file ..\..\..\..\Code\GraphMol\SmilesParse\smatest.cpp
    32: Failed Expression: mol
    32:
    32:
    1/1 Test #32: smaTest1 .***Failed 4.03 sec

    0% tests passed, 1 tests failed out of 1

    Total Test time (real) = 4.29 sec

    The following tests FAILED:
        32 - smaTest1 (Failed)
    Errors while running CTest
Re: [Rdkit-discuss] Two nitrogens in a 5 membered ring
Bob hit the nail on the head.

The first case, N1N=CC=C1, is aromatic because the RDKit sees that the first nitrogen has two bonds to it, assigns a hydrogen, and then sees a conjugated pi system with 6 electrons that is flagged as aromatic. Something similar would happen with the aromatic form [nH]1nccc1: first the ring system is kekulized to yield N1N=CC=C1, then the sanitization proceeds from there. The same thing would happen with the equivalent n1[nH]ccc1.

The second case, N1=NC=CC1, has a C (the last one) that only has single bonds to it. This is assigned sp3 hybridization, so there's no conjugated ring system for aromaticity to be perceived in.

The final case, n1nccc1, is an instance of the pyrrole problem: aromatic N's that need an implicit H on them should have that implicit H present in the aromatic SMILES.

-greg
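The three cases Greg describes can be reproduced in a few lines. This is a minimal sketch assuming a working RDKit install; the helper name describe is hypothetical, and the logger call is only there to silence the kekulization error that the last SMILES is expected to trigger.

```python
from rdkit import Chem
from rdkit import RDLogger

# Silence the expected "Can't kekulize mol" message for the last SMILES.
RDLogger.DisableLog("rdApp.error")


def describe(smiles):
    """Return (symbol, is_aromatic, total_H_count) for each atom,
    or None if RDKit cannot parse and sanitize the SMILES."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    return [(a.GetSymbol(), a.GetIsAromatic(), a.GetTotalNumHs())
            for a in mol.GetAtoms()]


# Kekule form with N-H, the two aromatic forms with an explicit [nH],
# the form with an sp3 carbon, and the aromatic form with no H on N.
for smi in ("N1N=CC=C1", "[nH]1nccc1", "n1[nH]ccc1", "N1=NC=CC1", "n1nccc1"):
    print(smi, describe(smi))
```

MolFromSmiles returns None when sanitization fails, which is how the n1nccc1 case shows up here instead of as an exception.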
Re: [Rdkit-discuss] Building RDKit on Windows
Hi James,

On Mon, Mar 3, 2014 at 6:45 PM, James Davidson j.david...@vernalis.com wrote:

> I have just rebuilt RDKit on Windows using the latest source, and am
> seeing a problem with smaTest1 failing (as well as still seeing the
> same DbCLI failure posted previously...) The smaTest1 failure seems a
> little strange because it actually throws a Windows executable error
> (smaTest1.exe has stopped working, etc). If I run ctest -V -R smaTest1
> I see the output below. Any thoughts?

I checked in some changes to allow the aromatic b in mid-February. When I did so, I added a test to make sure it's working. You've got the test (that's the one that is failing), but it looks like the build did not regenerate the parser files it needs to.

You should be able to clear this up by re-configuring cmake and regenerating the build files. After you do this, check to see that $RDBASE/Code/GraphMol/SmilesParse/smarts.tab.cpp has been freshly re-created.

-greg
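One way to do the "freshly re-created" check Greg suggests is to compare file modification times. The sketch below is plain Python and makes assumptions that are not in the thread: the helper name is_stale is made up, and the grammar and lexer inputs are assumed to be smarts.yy and smarts.ll in the same directory. Adjust those names to whatever your checkout contains.

```python
import os


def is_stale(generated, sources):
    """Return True if `generated` is missing or older than any existing source."""
    if not os.path.exists(generated):
        return True
    built = os.path.getmtime(generated)
    return any(os.path.exists(src) and os.path.getmtime(src) > built
               for src in sources)


if __name__ == "__main__":
    rdbase = os.environ.get("RDBASE", ".")
    parse_dir = os.path.join(rdbase, "Code", "GraphMol", "SmilesParse")
    generated = os.path.join(parse_dir, "smarts.tab.cpp")
    # Assumed input file names; they are not mentioned in the thread.
    sources = [os.path.join(parse_dir, name) for name in ("smarts.yy", "smarts.ll")]
    if is_stale(generated, sources):
        print("smarts.tab.cpp is missing or older than its inputs")
    else:
        print("smarts.tab.cpp looks fresh")
```

A stale result only says the generated file predates its inputs; it does not say why the build skipped regeneration.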
Re: [Rdkit-discuss] editing molecules
Hi,

On Sun, Mar 2, 2014 at 3:57 AM, S.L. Chan slch...@yahoo.com wrote:

> Good evening,
>
> I have some molecules with neutral amidine heads. I would like to make
> them positively charged. So I do:
>
>     mol = Chem.MolFromMolFile('input.mol', removeHs=False)
>     q = Chem.MolFromSmarts('[NH]=C[NH2]')
>     for mat in mol.GetSubstructMatches(q):
>         mol.GetAtomWithIdx(mat[0]).SetFormalCharge(1)
>         mol.GetAtomWithIdx(mat[0]).SetNumExplicitHs(2)
>     m1 = Chem.AddHs(mol, addCoords=True)
>
> However, RDKit complains that the valence for the N in concern is 4,
> which is greater than permitted.

You need to sanitize the molecule after you change the charge. This will automatically adjust the H count for you:

    In [5]: mol = Chem.MolFromSmiles('CC(=N)N')

    In [6]: q = Chem.MolFromSmarts('[NH]=C[NH2]')

    In [7]: for mat in mol.GetSubstructMatches(q):
       ...:     mol.GetAtomWithIdx(mat[0]).SetFormalCharge(1)
       ...:

    In [8]: Chem.SanitizeMol(mol)
    Out[8]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE

    In [11]: mh=Chem.AddHs(mol)

    In [12]: print(Chem.MolToSmiles(mol))
    CC(N)=[NH2+]

    In [13]: print(Chem.MolToSmiles(mh))
    [H]N([H])C(=[N+]([H])[H])C([H])([H])[H]

Best,
-greg

> I also tried it without the SetNumExplicitHs line. While it did not
> crash, it refused to add any H to the charged N. Attached is an
> example input file. But I got the same result with other input files
> too. Have I missed some steps?
>
> Thank you.
> Ling
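Greg's session can be wrapped in a small helper for reuse. This is a minimal sketch assuming a working RDKit install; the function name protonate_amidines and the constant AMIDINE are hypothetical, and the example uses the same SMILES input as the session above, not the mol file from the original question.

```python
from rdkit import Chem

# Same query as in the thread: imine N, central C, amine N of a neutral amidine.
AMIDINE = Chem.MolFromSmarts("[NH]=C[NH2]")


def protonate_amidines(mol):
    """Charge the imine N of each neutral amidine (+1), in place, then
    sanitize so that RDKit recomputes the hydrogen count on that atom."""
    for match in mol.GetSubstructMatches(AMIDINE):
        mol.GetAtomWithIdx(match[0]).SetFormalCharge(1)
    Chem.SanitizeMol(mol)
    return mol


mol = protonate_amidines(Chem.MolFromSmiles("CC(=N)N"))
charged = [a for a in mol.GetAtoms() if a.GetFormalCharge() == 1]
with_hs = Chem.AddHs(mol)

print(Chem.MolToSmiles(mol))
print([(a.GetSymbol(), a.GetTotalNumHs()) for a in charged])
print(with_hs.GetNumAtoms())
```

The point of the design is the one Greg makes: only the charge is set by hand, and the H count is left for sanitization to work out, so there is no SetNumExplicitHs call to get wrong.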