[Rdkit-discuss] Two nitrogens in a 5 membered ring

2014-03-03 Thread Toby Wright
Hi,

If I have a five membered ring with 2 consecutive Ns and alternating single
and double bonds expressed by the smiles: N1N=CC=C1 RDKit gives me a
molecule in which every atom is aromatic. If I give it: N1=NC=CC1 it gives
me a molecule in which every atom is aliphatic. If I give it: n1nccc1 it
gives me a kekulization error. I, possibly naively, thought the forms would
be all aromatic or all aliphatic. Am I missing something or is this a bug?

>>> Chem.MolFromSmiles('N1N=CC=C1').Debug()
Atoms:
0 7 N chg: 0  deg: 2 exp: 3 imp: 0 hyb: 3 arom?: 1 chi: 0
1 7 N chg: 0  deg: 2 exp: 3 imp: 0 hyb: 3 arom?: 1 chi: 0
2 6 C chg: 0  deg: 2 exp: 3 imp: 1 hyb: 3 arom?: 1 chi: 0
3 6 C chg: 0  deg: 2 exp: 3 imp: 1 hyb: 3 arom?: 1 chi: 0
4 6 C chg: 0  deg: 2 exp: 3 imp: 1 hyb: 3 arom?: 1 chi: 0
Bonds:
0 0-1 order: 12 conj?: 1 aromatic?: 1
1 1-2 order: 12 conj?: 1 aromatic?: 1
2 2-3 order: 12 conj?: 1 aromatic?: 1
3 3-4 order: 12 conj?: 1 aromatic?: 1
4 4-0 order: 12 conj?: 1 aromatic?: 1

>>> Chem.MolFromSmiles('N1=NC=CC1').Debug()
Atoms:
0 7 N chg: 0  deg: 2 exp: 3 imp: 0 hyb: 3 arom?: 0 chi: 0
1 7 N chg: 0  deg: 2 exp: 3 imp: 0 hyb: 3 arom?: 0 chi: 0
2 6 C chg: 0  deg: 2 exp: 3 imp: 1 hyb: 3 arom?: 0 chi: 0
3 6 C chg: 0  deg: 2 exp: 3 imp: 1 hyb: 3 arom?: 0 chi: 0
4 6 C chg: 0  deg: 2 exp: 2 imp: 2 hyb: 4 arom?: 0 chi: 0
Bonds:
0 0-1 order: 2 conj?: 1 aromatic?: 0
1 1-2 order: 1 conj?: 1 aromatic?: 0
2 2-3 order: 2 conj?: 1 aromatic?: 0
3 3-4 order: 1 conj?: 0 aromatic?: 0
4 4-0 order: 1 conj?: 0 aromatic?: 0

>>> Chem.MolFromSmiles('n1nccc1').Debug()
[15:31:44] Can't kekulize mol

Yours,

Toby Wright

--
InhibOx Ltd
--
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Two nitrogens in a 5 membered ring

2014-03-03 Thread Mikolaj Kowalik
On Mon, 3 Mar 2014 15:41:43 +
Toby Wright toby.wri...@inhibox.com wrote:

> If I have a five membered ring with 2 consecutive Ns and alternating single
> and double bonds expressed by the smiles: N1N=CC=C1 RDKit gives me a
> molecule in which every atom is aromatic. If I give it: N1=NC=CC1 it gives
> me a molecule in which every atom is aliphatic. If I give it: n1nccc1 it
> gives me a kekulization error. I, possibly naively, thought the forms would
> be all aromatic or all aliphatic. Am I missing something or is this a bug?

I would say that the behavior you described is due to Daylight's
specification of the aromaticity-detection algorithm for SMILES, which I
assume RDKit follows.

For more details, see section 3.4.2, Aromaticity, of the document below:

http://www.daylight.com/dayhtml/doc/theory/theory.smiles.html

Following it, I think that for RDKit to recognize your aromatic heterocycle
properly, its SMILES should read, e.g., [nH]1nccc1.

Hope it helps.

-- 
Mikołaj Kowalik



Re: [Rdkit-discuss] Two nitrogens in a 5 membered ring

2014-03-03 Thread Bob Funchess
Hi Toby,

I'd say it's more of a limitation inherent in Kekule representations than
an actual bug in RDKit.  Trying to get too clever in figuring out what
the user meant usually causes more harm than good.

I'm not sure what version of RDKit you're using, but the aromatic
specification with an explicit hydrogen on one of the nitrogen atoms works
for me:

>>> Chem.MolFromSmiles('n1[nH]ccc1').Debug();
Atoms:
0 7 N chg: 0  deg: 2 exp: 3 imp: 0 hyb: 3 arom?: 1 chi: 0
1 7 N chg: 0  deg: 2 exp: 3 imp: 0 hyb: 3 arom?: 1 chi: 0
2 6 C chg: 0  deg: 2 exp: 3 imp: 1 hyb: 3 arom?: 1 chi: 0
3 6 C chg: 0  deg: 2 exp: 3 imp: 1 hyb: 3 arom?: 1 chi: 0
4 6 C chg: 0  deg: 2 exp: 3 imp: 1 hyb: 3 arom?: 1 chi: 0
Bonds:
0 0-1 order: 12 conj?: 1 aromatic?: 1
1 1-2 order: 12 conj?: 1 aromatic?: 1
2 2-3 order: 12 conj?: 1 aromatic?: 1
3 3-4 order: 12 conj?: 1 aromatic?: 1
4 4-0 order: 12 conj?: 1 aromatic?: 1

The double bonds in the Kekule representations here can be between atom
pairs 1,2 and 3,4 or between atom pairs 2,3 and 4,0.  Putting one between
pair 0,1 leaves atom 4 with two single bonds to it (and therefore, to
satisfy valence requirements, two implicit hydrogens); I'm not horribly
surprised that RDKit perceives that as aliphatic.  You can see that's
what's happening in your second example where the hybridization of atom 4
is 4 (sp3) instead of 3 (sp2).
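
Bob's counting argument can be sketched in plain Python, with no RDKit involved. This is an editorial illustration under a deliberately simplified model: the ring atoms are numbered as in n1[nH]ccc1 (atoms 0 and 1 are N, atoms 2 to 4 are C), exactly two double bonds are placed on ring bonds, and the single atom left without a double bond is treated as compatible with aromaticity only if it is a nitrogen. The names ELEMENTS, RING_BONDS and kekule_placements are made up for this sketch.

```python
from itertools import combinations

# Ring atoms in the order of the SMILES n1[nH]ccc1: two N, then three C.
ELEMENTS = ["N", "N", "C", "C", "C"]
RING_SIZE = len(ELEMENTS)
# Ring bonds as (atom, atom) pairs: 0-1, 1-2, 2-3, 3-4, 4-0.
RING_BONDS = [(i, (i + 1) % RING_SIZE) for i in range(RING_SIZE)]


def kekule_placements():
    """Yield (double_bonds, saturated_atom) for every way of placing two
    double bonds in the ring so that no atom carries two of them."""
    for pair in combinations(RING_BONDS, 2):
        touched = [atom for bond in pair for atom in bond]
        if len(set(touched)) < len(touched):
            continue  # the two double bonds share an atom: not allowed
        # Exactly one ring atom is left with only single bonds.
        (saturated,) = set(range(RING_SIZE)) - set(touched)
        yield pair, saturated


for double_bonds, saturated in kekule_placements():
    element = ELEMENTS[saturated]
    # Simplified rule: an NH can still offer a lone pair to the ring,
    # while a CH2 carbon is sp3 and breaks the conjugation.
    verdict = "can be aromatic" if element == "N" else "sp3 carbon, aliphatic"
    print(double_bonds, "-> atom", saturated, f"({element}):", verdict)
```

Of the five placements, only the two Bob names (1,2 with 3,4 and 2,3 with 4,0) leave a nitrogen as the saturated atom; every placement that uses bond 0,1 leaves a CH2 at atom 2 or atom 4.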



Regards,

Bob

--
Bob Funchess, Ph.D.                Kelaroo, Inc
Senior Scientist                   www.kelaroo.com
bfunch...@kelaroo.com              (858) 259-7561 x3



Re: [Rdkit-discuss] Building RDKit on Windows

2014-03-03 Thread James Davidson
Hi All,

I have just rebuilt RDKit on Windows using the latest source, and am seeing a 
problem with smaTest1 failing (as well as still seeing the same DbCLI failure 
posted previously...)
The smaTest1 failure seems a little strange because it actually throws a 
Windows executable error  (smaTest1.exe has stopped working, etc).
If I run ctest -V -R smaTest1 I see the output below.  Any thoughts?

Kind regards

James



C:\RDKit\build>ctest -V -R smaTest1
UpdateCTestConfiguration  from :C:/RDKit/build/DartConfiguration.tcl
UpdateCTestConfiguration  from :C:/RDKit/build/DartConfiguration.tcl
Test project C:/RDKit/build
Constructing a list of tests
Done constructing a list of tests
Checking test dependency graph...
Checking test dependency graph end
test 32
Start 32: smaTest1

32: Test command: C:\RDKit\build\Code\GraphMol\SmilesParse\Release\smaTest1.exe
32: Test timeout computed to be: 9.99988e+006
32: [17:42:57] -
32: [17:42:57] Testing patterns which should parse.
32: [17:42:57] SMARTS Parse Error: syntax error for input: c1b1
32: [17:42:57]
32:
32: 
32: Invariant Violation
32: c1b1
32: Violation occurred on line 90 in file ..\..\..\..\Code\GraphMol\SmilesParse\smatest.cpp
32: Failed Expression: mol
32: 
32:
1/1 Test #32: smaTest1 .***Failed    4.03 sec

0% tests passed, 1 tests failed out of 1

Total Test time (real) =   4.29 sec

The following tests FAILED:
 32 - smaTest1 (Failed)
Errors while running CTest



Re: [Rdkit-discuss] Two nitrogens in a 5 membered ring

2014-03-03 Thread Greg Landrum
Bob hit the nail on the head.

The first case, N1N=CC=C1, is aromatic because the RDKit sees that the
first nitrogen has two bonds to it, assigns a hydrogen, and then sees a
conjugated pi system with 6 electrons that is flagged as aromatic.
Something similar would happen with the aromatic form [nH]1nccc1: first the
ring system is kekulized to yield N1N=CC=C1, then the sanitization proceeds
from there. The same thing would happen with the equivalent n1[nH]ccc1.

The second case, N1=NC=CC1, has a C (the last one) that only has single
bonds to it. This is assigned sp3 hybridization, so there's no conjugated
ring system for aromaticity to be perceived in.

The final case, n1nccc1, is an instance of the pyrrole problem: an aromatic
N that needs an H on it should have that H written explicitly in the
aromatic SMILES, e.g. [nH]1nccc1.
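
The three cases, plus the two [nH] forms, can be checked side by side with a short script. This is an editorial sketch that assumes RDKit is installed and that default sanitization is used; the helper name aromatic_flags is made up for the illustration, while Chem.MolFromSmiles, GetAtoms and GetIsAromatic are the regular RDKit Python calls.

```python
from rdkit import Chem


def aromatic_flags(smiles):
    """Return one aromaticity flag per atom, or None if RDKit cannot
    build a molecule from the SMILES (MolFromSmiles returns None)."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    return [atom.GetIsAromatic() for atom in mol.GetAtoms()]


for smi in ("N1N=CC=C1", "N1=NC=CC1", "n1nccc1", "[nH]1nccc1", "n1[nH]ccc1"):
    flags = aromatic_flags(smi)
    if flags is None:
        # RDKit also logs its own message (here: a kekulization failure).
        print(f"{smi:12s} could not be parsed")
    else:
        print(f"{smi:12s} aromatic atoms: {sum(flags)} of {len(flags)}")
```

The Kekule form N1N=CC=C1 and both [nH] forms should come out fully aromatic, N1=NC=CC1 fully aliphatic, and n1nccc1 should fail to parse, matching the Debug() output earlier in the thread.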

-greg






Re: [Rdkit-discuss] Building RDKit on Windows

2014-03-03 Thread Greg Landrum
Hi James,

On Mon, Mar 3, 2014 at 6:45 PM, James Davidson j.david...@vernalis.com wrote:

> I have just rebuilt RDKit on Windows using the latest source, and am
> seeing a problem with smaTest1 failing (as well as still seeing the same
> DbCLI failure posted previously...)
>
> The smaTest1 failure seems a little strange because it actually throws a
> Windows executable error  (smaTest1.exe has stopped working, etc).
>
> If I run ctest -V -R smaTest1 I see the output below.  Any thoughts?


I checked in some changes to allow the aromatic b in mid-February. When I
did so, I added a test to make sure it's working. You've got the test
(that's the one that is failing), but it looks like the build did not
regenerate the parser files it needs to. You should be able to clear this
up by re-configuring cmake and regenerating the build files. After you do
this, check to see that $RDBASE/Code/GraphMol/SmilesParse/smarts.tab.cpp
has been freshly re-created.
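
One way to do that last check without eyeballing timestamps is a few lines of standard-library Python. This is an editorial sketch, not part of the RDKit build: the helper names and the one-hour threshold are arbitrary choices made here, and the script assumes the RDBASE environment variable points at the source tree.

```python
import os
import time


def seconds_since_modified(path):
    """Age of the file's last modification, in seconds."""
    return time.time() - os.path.getmtime(path)


def looks_freshly_generated(path, max_age_seconds=3600):
    """True if the file exists and was modified within the given window."""
    if not os.path.isfile(path):
        return False
    return seconds_since_modified(path) <= max_age_seconds


if __name__ == "__main__":
    # Assumption: RDBASE is set; fall back to the current directory otherwise.
    rdbase = os.environ.get("RDBASE", ".")
    parser_file = os.path.join(
        rdbase, "Code", "GraphMol", "SmilesParse", "smarts.tab.cpp"
    )
    if looks_freshly_generated(parser_file):
        print(parser_file, "was regenerated within the last hour")
    else:
        print(parser_file, "is missing or older than one hour")
```

Run it right after re-configuring and rebuilding; a "missing or older" result suggests the parser sources were not regenerated.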

-greg


Re: [Rdkit-discuss] editing molecules

2014-03-03 Thread Greg Landrum
Hi,


On Sun, Mar 2, 2014 at 3:57 AM, S.L. Chan slch...@yahoo.com wrote:

> Good evening,
>
> I have some molecules with neutral amidine heads. I would like to
> make them positively charged. So I do:
>
> mol = Chem.MolFromMolFile('input.mol', removeHs=False)
> q = Chem.MolFromSmarts('[NH]=C[NH2]')
> for mat in mol.GetSubstructMatches(q):
>     mol.GetAtomWithIdx(mat[0]).SetFormalCharge(1)
>     mol.GetAtomWithIdx(mat[0]).SetNumExplicitHs(2)
>
> m1 = Chem.AddHs(mol, addCoords=True)
>
> However, RDKit complains that the valence for the N in concern is 4,
> which is greater than permitted.


You need to sanitize the molecule after you change the charge. This will
automatically adjust the H count for you:

In [5]: mol = Chem.MolFromSmiles('CC(=N)N')
In [6]: q = Chem.MolFromSmarts('[NH]=C[NH2]')
In [7]: for mat in mol.GetSubstructMatches(q):
   ...:     mol.GetAtomWithIdx(mat[0]).SetFormalCharge(1)
   ...:

In [8]: Chem.SanitizeMol(mol)
Out[8]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE

In [11]: mh=Chem.AddHs(mol)

In [12]: print Chem.MolToSmiles(mol)
CC(N)=[NH2+]

In [13]: print Chem.MolToSmiles(mh)
[H]N([H])C(=[N+]([H])[H])C([H])([H])[H]
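
For anyone on Python 3, the same steps can be collected into a standalone script. This is an editorial sketch under the same assumption as the session above, namely that the molecule comes from SMILES so the hydrogens on nitrogen are implicit; it does not cover the MolFile case with removeHs=False from the original question. The function name protonate_amidines is made up for the illustration.

```python
from rdkit import Chem


def protonate_amidines(mol):
    """Return a copy of mol with the =NH nitrogen of each neutral
    amidine group given a +1 formal charge, then re-sanitized."""
    mol = Chem.Mol(mol)  # work on a copy, leave the input untouched
    pattern = Chem.MolFromSmarts("[NH]=C[NH2]")
    for match in mol.GetSubstructMatches(pattern):
        # match[0] is the atom matched by the first SMARTS atom, [NH].
        mol.GetAtomWithIdx(match[0]).SetFormalCharge(1)
    # Sanitization recomputes the implicit H count for the new charge.
    Chem.SanitizeMol(mol)
    return mol


charged = protonate_amidines(Chem.MolFromSmiles("CC(=N)N"))
for atom in charged.GetAtoms():
    if atom.GetFormalCharge() != 0:
        print(atom.GetSymbol(), atom.GetFormalCharge(), atom.GetTotalNumHs())

with_hs = Chem.AddHs(charged)
print(Chem.MolToSmiles(charged))
print(Chem.MolToSmiles(with_hs))
```

The charged nitrogen should report two hydrogens after sanitization, and AddHs then turns all implicit hydrogens into explicit atoms.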


Best,
-greg






> I also tried it without the SetNumExplicitHs line. While it did not
> crash, it refused to add any H to the charged N.
>
> Attached is an example input file. But I got the same result with
> other input files too.
>
> Have I missed some steps?
>
> Thank you.
>
> Ling

