Re: [Rdkit-discuss] SMARTS for heteroaromatic rings?
Greg,

You are suggesting some interesting ideas. Matching the atoms in 5- and 6-membered aromatic rings will probably be sufficient for now. I was initially stumped trying to figure out an elegant way to deal with aromatic N's, O's, and S's in various combinations. The use of "a" in SMARTS is powerful in this regard. Thanks again.

Regards,
Jim Metz

-----Original Message-----
From: Greg Landrum <greg.land...@gmail.com>
To: James T. Metz <jamestm...@aol.com>
Cc: Jason Biggs <jasondbi...@gmail.com>; RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
Sent: Thu, Sep 21, 2017 12:32 am
Subject: Re: [Rdkit-discuss] SMARTS for heteroaromatic rings?

My approach to this would depend on what you're trying to accomplish in the end.

If you just want all the aromatic atoms you can just use "[a]". Unless you do some extra work when you read in the molecules, any aromatic atom will be in a ring. If you want to be really sure, you can do "[a;r]".

If you want all the aromatic bonds, it's "[a]:[a]".

If you want the rings themselves and you want to just use SMARTS, you have to enumerate. Python makes getting the patterns pretty easy:

In [8]: patts = ["[a]:1"+":[a]"*i+":[a]:1" for i in range(3,23)]  # 24 is the max aromatic ring size

In [9]: patts[:3]
Out[9]:
['[a]:1:[a]:[a]:[a]:[a]:1',
 '[a]:1:[a]:[a]:[a]:[a]:[a]:1',
 '[a]:1:[a]:[a]:[a]:[a]:[a]:[a]:1']

The rest is just some calls to MolFromSmarts() and then mol.GetSubstructMatches() for the molecules you want to test.

-greg

On Thu, Sep 21, 2017 at 3:56 AM, James T. Metz via Rdkit-discuss <rdkit-discuss@lists.sourceforge.net> wrote:

Jason,

Thanks! I just thought of that for a 6-membered ring. A 5-membered ring would be [a]1[a][a][a][a]1.

Hmmm... I was thinking of using "r" to specify a ring, but I don't think that would be necessary. Correct?

Regards,
Jim Metz

-----Original Message-----
From: Jason Biggs <jasondbi...@gmail.com>
To: James T. Metz <jamestm...@aol.com>
Cc: RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
Sent: Wed, Sep 20, 2017 8:36 pm
Subject: Re: [Rdkit-discuss] SMARTS for heteroaromatic rings?

If you don't care what type of atom it is, just that it's aromatic, you should use [a],

so [a]1[a][a][a][a][a]1 would match any 6-membered aromatic ring

Jason Biggs

On Wed, Sep 20, 2017 at 7:57 PM, James T. Metz via Rdkit-discuss <rdkit-discuss@lists.sourceforge.net> wrote:

Hello,

I would like to write a SMARTS that will match all of the individual atoms in all possible heteroaromatic rings. Does anyone know of an elegant, compact way to do this?

If one SMARTS will not work, I can concatenate SMARTS using a vertical pipe, "|", as I proposed in an earlier message in this forum.

I am (perhaps) expecting SMARTS something like

[c]1[c][n][c][c]1
etc
[c]1[c][c][c][c][c]1
[c]1[c][n][c][c][c]1
etc.

Perhaps there is a very elegant way to specify the possible patterns. I can't think of a way to do it, other than exhaustive enumeration.

Any ideas?

Regards,
Jim Metz

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] SMARTS for heteroaromatic rings?
My approach to this would depend on what you're trying to accomplish in the end.

If you just want all the aromatic atoms you can just use "[a]". Unless you do some extra work when you read in the molecules, any aromatic atom will be in a ring. If you want to be really sure, you can do "[a;r]".

If you want all the aromatic bonds, it's "[a]:[a]".

If you want the rings themselves and you want to just use SMARTS, you have to enumerate. Python makes getting the patterns pretty easy:

In [8]: patts = ["[a]:1"+":[a]"*i+":[a]:1" for i in range(3,23)]  # 24 is the max aromatic ring size

In [9]: patts[:3]
Out[9]:
['[a]:1:[a]:[a]:[a]:[a]:1',
 '[a]:1:[a]:[a]:[a]:[a]:[a]:1',
 '[a]:1:[a]:[a]:[a]:[a]:[a]:[a]:1']

The rest is just some calls to MolFromSmarts() and then mol.GetSubstructMatches() for the molecules you want to test.

-greg
Re: [Rdkit-discuss] SMARTS for heteroaromatic rings?
Jason,

Thanks! I just thought of that for a 6-membered ring. A 5-membered ring would be [a]1[a][a][a][a]1.

Hmmm... I was thinking of using "r" to specify a ring, but I don't think that would be necessary. Correct?

Regards,
Jim Metz

-----Original Message-----
From: Jason Biggs <jasondbi...@gmail.com>
To: James T. Metz <jamestm...@aol.com>
Cc: RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
Sent: Wed, Sep 20, 2017 8:36 pm
Subject: Re: [Rdkit-discuss] SMARTS for heteroaromatic rings?

If you don't care what type of atom it is, just that it's aromatic, you should use [a],

so [a]1[a][a][a][a][a]1 would match any 6-membered aromatic ring

Jason Biggs
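The ring patterns discussed in this thread can be generated by ring size and then run against a molecule. What follows is only a minimal sketch of that idea, not anyone's posted code: the helper names `aromatic_ring_smarts` and `heteroaromatic_ring_atoms` are hypothetical, the 5 to 24 size range follows the maximum aromatic ring size quoted in the thread, and treating "heteroaromatic" as "an all-aromatic ring match containing at least one non-carbon atom" is an assumption. The RDKit calls used are `Chem.MolFromSmiles`, `Chem.MolFromSmarts`, and `Mol.GetSubstructMatches`.

```python
def aromatic_ring_smarts(min_size=5, max_size=24):
    """Return one all-aromatic ring SMARTS per ring size, smallest first."""
    # Two ring-closure atoms plus (size - 2) atoms in between,
    # with every bond written explicitly as aromatic (":").
    return ["[a]:1" + ":[a]" * (size - 2) + ":[a]:1"
            for size in range(min_size, max_size + 1)]


def heteroaromatic_ring_atoms(smiles, min_size=5, max_size=24):
    """Return sorted atom indices found in aromatic ring matches that
    contain at least one non-carbon atom (assumed definition)."""
    # Imported lazily so the pattern helper above works without RDKit.
    from rdkit import Chem

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")

    hits = set()
    for smarts in aromatic_ring_smarts(min_size, max_size):
        patt = Chem.MolFromSmarts(smarts)
        for match in mol.GetSubstructMatches(patt):
            # Keep the ring only if some atom in it is not carbon.
            if any(mol.GetAtomWithIdx(i).GetAtomicNum() != 6 for i in match):
                hits.update(match)
    return sorted(hits)
```

Calling `heteroaromatic_ring_atoms("c1ccncc1", max_size=6)` restricts the search to 5- and 6-membered rings. With larger sizes allowed, a pattern can also match the outer perimeter of a fused aromatic system, so carbocyclic rings fused to a heteroaromatic ring may be reported as well; whether that is wanted depends on the goal.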
[Rdkit-discuss] SMARTS for heteroaromatic rings?
Hello,

I would like to write a SMARTS that will match all of the individual atoms in all possible heteroaromatic rings. Does anyone know of an elegant, compact way to do this?

If one SMARTS will not work, I can concatenate SMARTS using a vertical pipe, "|", as I proposed in an earlier message in this forum.

I am (perhaps) expecting SMARTS something like

[c]1[c][n][c][c]1
etc
[c]1[c][c][c][c][c]1
[c]1[c][n][c][c][c]1
etc.

Perhaps there is a very elegant way to specify the possible patterns. I can't think of a way to do it, other than exhaustive enumeration.

Any ideas?

Regards,
Jim Metz
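To get a feel for what the exhaustive enumeration mentioned above would involve, here is a rough sketch that writes out every explicit-element pattern in the style of [c]1[c][n][c][c]1. It is only an illustration under stated assumptions: the element set c, n, o, s is assumed (the thread mentions aromatic N, O, and S), the function name `explicit_ring_smarts` is hypothetical, and no attempt is made to drop rotational duplicates or chemically impossible rings.

```python
from itertools import product

# Assumed set of aromatic atom symbols; not taken from any library.
AROMATIC_ATOMS = ("c", "n", "o", "s")


def explicit_ring_smarts(size):
    """Yield every explicit-element ring SMARTS of the given size that
    contains at least one heteroatom (duplicates by rotation included)."""
    for combo in product(AROMATIC_ATOMS, repeat=size):
        if all(symbol == "c" for symbol in combo):
            continue  # an all-carbon ring is not heteroaromatic
        atoms = [f"[{symbol}]" for symbol in combo]
        # Ring-closure digit after the first atom and after the last atom.
        yield atoms[0] + "1" + "".join(atoms[1:]) + "1"


five = list(explicit_ring_smarts(5))
six = list(explicit_ring_smarts(6))
print(len(five), len(six))  # 4**5 - 1 and 4**6 - 1 patterns
```

Even for only 5- and 6-membered rings and four atom types, this yields 1023 + 4095 raw patterns, which is the practical argument for a single element-agnostic aromatic atom primitive instead.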